THE PREDICTION OF COLLEGE GRADES FROM 
PERSONALITY AND APTITUDE VARIABLES¹ 


JOHN L. HOLLAND 


National Merit Scholarship Corporation, Evanston, Illinois 


This study was designed primarily 
to explore the usefulness of nonintel- 
lectual factors in predicting college 
grades and to provide information for 
the development of a theory of aca- 
demic achievement. Previous studies 
have tested the validity of the Cali- 
fornia Psychological Inventory, the 
Scholastic Aptitude Test, and high 
school grades (HSR) (Holland, 1958b, 
1959a). At this stage of investigation, 
we are still concerned with testing a 
large pool of nonintellectual variables 
assumed to be related to academic 
achievement before we attempt to de- 
vise a conceptual framework for order- 
ing these diverse variables. The pres- 
ent study tested the validity of the 
Sixteen Personality Factor Question- 
naire (16 PF), the National Merit 
Student Survey (NMSS), and the Vo- 
cational Preference Inventory (VPI) 
for predicting the freshman year 
grades of a sample of high aptitude 
students attending 277 colleges and 
universities. In addition, the predic- 
tive validities of the Scholastic Apti- 
tude Test (SAT) and High School 
Rank (HSR) were tested. 


METHOD 


¹ This study was partially supported by 
the National Science Foundation and the 
Old Dominion Foundation. The author 
wishes to thank Donald L. Thistlethwaite 
and Laura Kent for their editorial assistance. 

Student Sample. The sample consisted of 
% of a one-sixth random sample (641 boys, 
311 girls) drawn from 7,500 Finalists, the 
survivors of a national competition in which 
255,942 high school seniors participated 
(National Merit Scholarship Corporation, 
1958). The 16 PF, NMSS, and VPI were 
administered to this student sample about 
3 months before the fall term of college. The 
SAT scores and background data were ob- 
tained about 7 months before the fall term. 
Although an attempt was made to obtain a 
sample which would be representative of the 
Finalists in the National Merit program, the 
original return of 83% was reduced to 65% 
because of the loss of criterion data (grades) 
due to changes in students’ addresses and the 
failure of some colleges to respond to our 
request for grades. 

The present sample is similar to the 
sample reported on earlier (Holland, 1959a) 
with respect to scholastic aptitude, high 
school achievement, and family background. 
The means on the SAT Verbal and Mathe- 
matical factors, respectively, are: for boys, 
675.3 and 707.6, with standard deviations of 
57.4 and 68.9; for girls, 685.2 and 641.9, with 
standard deviations of 58.5 and 72.9. 

Predictors. Form A of Cattell’s 16 PF test 
was used. This inventory is well known and 
has been described in a number of publica- 
tions (Cattell, 1957; Cattell, Saunders, & 
Stice, 1957). The NMSS is an experimental 
personality inventory devised from a review 
of the literature.² It includes 10 internally 
consistent scales made up of true-false, 
forced-choice, and multiple-choice items 
which are intended to measure some of the 
more important personality and attitudinal 
variables related to academic achievement. 
The scales are: Dedication to Scholarship, 
Dependence, Dominance, Play, Intellec- 
tualism, Introversion, Parental Press, Per- 
sistence, Superego, and Tolerance for Am- 
biguity. The average item-total score 
correlations for all scales and both sexes 
range from .37 to .56, with a mean of .42. The 
VPI, an experimental personality inventory 
composed of occupational titles, is a revision 
of the Holland Vocational Preference Inven- 
tory, which has been described elsewhere 
(Holland, 1958a). It consists of the follow- 
ing scales: Acquiescence, Infrequency, Physi- 
cal Activity, Intellectuality, Responsibility, 
Conformity, Verbal Activity, Emotionality, 
Aggressiveness, Control, Masculinity, and 
Status. 

² The National Merit Student Survey was 
constructed by the author and Donald L. 
Thistlethwaite. We wish to acknowledge our 
indebtedness to our staff for assistance in 
suggesting ideas for items. Similarly, we are 
indebted to Calvin W. Taylor, Benno G. 
Fricke, and Morris I. Stein for items and 
ideas. 

Criterion. Freshman grades in college, or 
honor point ratio (HPR), were used as the 
criterion of scholastic achievement. The 
grading systems of all the colleges were con- 
verted to HPR by means of a standard 
formula.³ Generally, this formula was ap- 
plied to collegiate grading systems by using 
the equivalences given in college regulations 
or in letters from the colleges. In a few 
instances, numerical values had to be as- 
signed to letter grades on the basis of the 
investigator’s judgment. 
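
The honor point ratio conversion lends itself to a short illustration. The sketch below is a minimal one, assuming the grade values the study reports (A = 4, B = 3, C = 2, D = 1, F = 0); the function names `simple_hpr` and `weighted_hpr` and the sample transcript are hypothetical and are not taken from the study. It contrasts the simplified formula (grade values added and divided by the number of courses) with the credit-weighted formula.

```python
# Grade values as reported in the study; everything else here is illustrative.
GRADE_POINTS = {"A": 4, "B": 3, "C": 2, "D": 1, "F": 0}


def simple_hpr(grades):
    """Simplified HPR: grade values added and divided by the number of courses."""
    points = [GRADE_POINTS[g] for g in grades]
    return sum(points) / len(points)


def weighted_hpr(grades, credits):
    """Credit-weighted HPR: credits times grade value, summed, over total credits."""
    total = sum(c * GRADE_POINTS[g] for g, c in zip(grades, credits))
    return total / sum(credits)


# Hypothetical transcript of four courses (not data from the study).
grades = ["A", "B", "B", "C"]
credits = [3, 4, 3, 2]

print(simple_hpr(grades))             # unweighted ratio
print(weighted_hpr(grades, credits))  # credit-weighted ratio
```

For a transcript like this one the two ratios differ only slightly, which is in keeping with the near-unity correlation the study reports between the two formulas.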

³ All grades were converted by an honor 
point ratio formula where A = 4, B = 3, C = 
2, D = 1, and F = 0; grade values were 
added and divided by the number of courses. 
This simplification for a sample of 582 stu- 
dents correlates .989 with the more laborious 
formula in which credits per course are mul- 
tiplied by grades and divided by total credits 
carried. 

Analysis. College grades were correlated 
(Pearson product-moment) with the 16 PF, 
NMSS, VPI, HSR, and SAT variables for 
student samples which had been classified 
in four different ways. First, correlations 
were computed for the total male and female 
samples. Second, the colleges attended by 
the male and female samples were dichoto- 
mized, using their Talent Supply Index 
(TSI), and correlations were obtained for 
samples of students attending institutions of 
“high” and “low” talent supply. TSI is an 
estimate of the average scholastic aptitude 
(SAT) of a college’s student body and was 
derived in an earlier study (Thistlethwaite, 
1959a). The TSI correlates .74 to .76 with 
the average SAT scores for a sample of 43 
colleges. Third, the male and female samples 
were classified by means of PhD productiv- 
ity indexes for Natural Science (NS) and 
for Arts, Humanities, and Social Science 
(AHSS) (Thistlethwaite, 1959a), and corre- 
lations were computed for student samples 
attending colleges which fall above and 
below the median on these two indexes. 
Finally, correlations between grades and pre- 
dictors were obtained for the two institu- 
tions with the largest student samples. 
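
The classification step of the analysis (splitting colleges at the median of an index and correlating grades with a predictor separately within each half) can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually run in the study: the function names, the record layout, the toy data, and the choice to place colleges exactly at the median in the "low" group are all hypothetical.

```python
from math import sqrt
from statistics import median


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_correlations(records, college_index):
    """Correlate predictor with grade within high- and low-index colleges.

    records: iterable of (college, predictor_score, grade) tuples.
    college_index: dict mapping college name to its index value (e.g., TSI).
    Assumption: a college exactly at the median is placed in the "low" group.
    """
    cut = median(college_index.values())
    groups = {"high": ([], []), "low": ([], [])}
    for college, predictor, grade in records:
        key = "high" if college_index[college] > cut else "low"
        groups[key][0].append(predictor)
        groups[key][1].append(grade)
    return {key: pearson_r(x, y) for key, (x, y) in groups.items()}


# Toy data, invented for illustration only.
index = {"c1": 10, "c2": 20, "c3": 30, "c4": 40}
records = [
    ("c1", 1, 2.0), ("c1", 2, 2.5), ("c2", 3, 3.5),
    ("c3", 1, 3.0), ("c3", 2, 2.0), ("c4", 3, 1.0),
]
print(split_correlations(records, index))
```

Because each half is analyzed separately, a predictor can show correlations of different size, or even of opposite sign, in the two groups, which is the kind of contrast the classified samples are meant to expose.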


RESULTS 


Table 1 shows the correlations be- 
tween freshman college grades and the 
40 predictor variables (HSR, SAT-V, 
SAT-M, 16 PF, NMSS, and VPI) 
for the total samples of males and fe- 
males and for the high and low TSI 
samples (the colleges having high vs. 
those having low aptitude levels). The 
correlations for the total male and fe- 
male samples were computed as a 
means of estimating the efficiency of 
the TSI method (Table 1) and of the 
PhD indexes and individual college 
analyses (Tables 3 and 4). It was as- 
sumed that the use of the TSI would 
increase the correlations between the 
predictors and college grades, since 
grouping colleges by estimated apti- 
tude level of the freshman class would 
tend to equate grades among colleges 
without any substantial change in the 
variability of grades. This procedure 
follows what Bloom and Peters (1959) 
have called the “aptitude method”— 
applied to the criterion (college 
grades) instead of the predictors (high 
school grades)—to yield better pre- 
dictions. 

Generally, the TSI analysis pro- 
duced higher correlations for the SAT 
and HSR variables against college 
grades, although most of the correla- 
tions are low because they are still 
computed across colleges. With re- 
spect to personality variables, the cor- 
relations for the samples dichotomized 
by the TSI show, in general, only 
negligible differences from the correla- 
tions for the total samples. 

TABLE 1 

CORRELATION OF COLLEGE GRADES WITH INTELLECTUAL AND NONINTELLECTUAL 
PREDICTORS FOR TOTAL SAMPLES AND FOR SAMPLES CLASSIFIED BY TALENT 
SUPPLY INDEX 

                          Total Sampleᵃ           Talent Supply Index
                                                  Males                   Females
Variable                   M           F           High        Low         High        Low
HSR                        40**        40**        40**        55**        32**        42**
SAT-V                      09*         02          16**        19**        06          22**
SAT-M                      13**        04          14*         21**       -03          23**
NMSS
   1. Scholarship          07          02          16**        05          05          10
   2. Dependence          [illegible]  11          05          13*         07          12
   3. Dominance            08*        -01          14*         03          05          02
   4. Play                -25**        06         -27**       -27**        05         -32**
   5. Intellectualism      08*         00          11*         07          05         -01
   6. Introversion         08*        -14*         09         [illegible] -17*         06
   7. Parental Press       01         -06          05         [illegible]  11         -15
   8. Persistence          19**        23**       [illegible] [illegible]  14          35**
   9. Superego             21**        25**        14*         18**        18*         26**
  10. Tol. for Ambig.      06         -09          03          07         -03          01
16 PF
  11. A Sociable           01         -04          03          00         -05          01
  12. B Intelligent        00         -04          00          04         -15          15
  13. C Mature            -03          01         -08          01          07         -10
  14. E Dominant          -13**       -19**       -08         -14*        [illegible] -19*
  15. F Cheerful          -20**       -09         -19**       -22**       -01         -15
  16. G Persistent         11*        [illegible]  07          11*         17*         18*
  17. H Adventurous       -08*        -12*        -07         -10         -02         -13
  18. I Effeminate         13**        04          13*         12*         04          03
  19. L Paranoid           01         -01         -01          02         -02         -04
  20. M Introverted        04         -13*        -01          14*        -01         -13
  21. N Shrewd            -13**       -06         -10         -17**        00         -09
  22. O Insecure          -01         -08          02         -03         -10         -08
  23. Q1 Radical          -07         -19**       -01         -09         -14         -13
  24. Q2 Self-Sufficient  [illegible] -06          14*         12*        -05          00
  25. Q3 Controlled        06          14*        [illegible]  09          07         [illegible]
  26. Q4 Tense             02         -07          05         -03         -09         -09
VPI
  27. Infrequency          08          09          35*         02          15         -02
  28. Physical Activity   -04         -02         -06         -09         -05         -04
  29. Intellectuality      01          04          02          02         -05          10
  30. Responsibility       05          05          03          09          05          00
  31. Conformity          -03          06         -03         -02          03         -02
  32. Verbal Activity     -07         -08         -09         -05         -10         -10
  33. Emotionality         09*         01          07          10          02         -02
  34. Aggressiveness      -03         -06         -07          04         -08         -04
  35. Control              33**        12*         16**        10          25**        05
  36. Masculinity         -05         -04         -02         -08          00         -11
  37. Status               05          00          07          08          12         -09
N                          641         311         323         318         155         156

ᵃ Correlation coefficients for the nonintellectual predictors for the total sample have had the effect 
of aptitude (SAT-M) partialed out. The average difference between the zero order and partial correlations is less than .01. 
* Significant at .05 level. 
** Significant at .01 level. 

When the correlations at the 1% 
level in Table 1 are interpreted, they 
reveal that the college achiever has 
done well in high school (HSR) and 
has high scholastic aptitude. The non- 
intellectual predictors shown in Table 
1 characterize the male achiever as 
dependent (2), serious (not playful) 
(4), persistent (8), responsible (9), 
submissive (14), quiet (15), persistent 
(16), feminine (18), naive (21), self- 
sufficient (24), and self-controlled 
(35). The female achiever is char- 
acterized as persistent (8), responsible 
(9), submissive (14), persistent (16), 
and conservative (23). 

Perhaps the most important finding 
in Table 1 and succeeding tables is 
that HSR is, in general, consistently 
superior as a predictor of college 
grades (r’s range from .32 to .55). The 
greater efficiency of HSR in this study 
as compared to its lesser efficiency in 
an earlier study (Holland, 1958b) is a 
function of the greater variances in 
HSR and HPR in the present study. 
To explicate the predictive efficiency 
of HSR, the 39 predictors and a 
teacher rating of “Maturity” were in- 
tercorrelated with HSR. This analysis 
was performed with smaller repre- 
sentative samples selected by machine 
from the total samples (boys = 148, 
girls = 140). The variables signifi- 
cantly correlated with HSR, when 
SAT-M is partialed out, are shown 
in Table 2. 

TABLE 2 
VARIABLES CORRELATED WITH HIGH SCHOOL RANK 
(SAT-M partialed out) 

Boys                                  Girls
Variable                    r         Variable                    r
TR Maturity                 .28       32. Verbal Activity        -.21
28. Physical Activity      -.22       35. Control                 .19
 4. Play                   -.21       SAT-M                       .18
23. Radicalism             -.18       37. Status                 -.18
37. Status                  .19       34. Aggressiveness         -.16
 9. Superego                .21       29. Intellectualism         .15
 8. Persistence             .16       24. Self-Sufficiency        .17
SAT-M                       .17
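
Partialing SAT-M out of a correlation, as is done for the total-sample coefficients and for the correlates of HSR, can be written with the standard first-order partial correlation formula. The sketch below is a minimal one, assuming that this standard formula is what "partialed out" refers to; the function name `partial_r` and the three input correlations are hypothetical and are not values from the study.

```python
from math import sqrt


def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant.

    r_xy: zero-order correlation between predictor x and criterion y.
    r_xz: correlation between predictor x and the control variable z.
    r_yz: correlation between criterion y and the control variable z.
    """
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Illustrative inputs only (not taken from the tables):
# x = a personality scale, y = grades, z = an aptitude score.
print(round(partial_r(0.30, 0.20, 0.40), 3))
```

When the control variable is nearly uncorrelated with either the predictor or the criterion, the partial coefficient stays close to the zero-order one, which fits the small average differences reported for this restricted, high-aptitude sample.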

These correlates imply that the boy 
with high HSR is characterized by a 
number of personal traits in addition 
to his drive to achieve academically. 
For instance, he is often rated high 
by his teachers. The Maturity rating 
is regarded as a measure of the degree 
to which students are rated high or 
low by school personnel, since this 
variable has the highest average cor- 
relation among the 12 closely related 
ratings used (Holland, 
1959b). He appears to be somewhat 
feminine (dislikes Physical Activity), 
serious (as opposed to playful), con- 
servative, aspiring, responsible, per- 
sistent, and intelligent. The corre- 
lates for girls suggest that high HSR 
is associated with submissiveness (also 
unsociability), self-control, intelli- 
gence, self-deprecation, passivity, in- 
tellectuality, and self-sufficiency. 

Taken together, these correlates 
suggest a plausible explanation for 
the predictive efficiency of HSR. The 
highest correlates of HSR are gen- 
erally the best predictors of college 
grades. These correlates indicate then 
that HSR is a function of high aca- 
demic aptitude, an academic personal- 
ity syndrome, and positive relation- 
ships with teachers. That a complex 
measure encompassing such variables 
is an efficient predictor of college 
grades is clear from these data. 

In order to determine how the per- 
sonality and aptitudes of the academic 
achiever are related to the atmosphere 
and PhD productivity of his college, 
a third analysis was performed by 
classifying the colleges attended by 
the total sample as high or low on 
the NS and AHSS indexes (Thistle- 
thwaite, 1959a), and by recomputing 
the grade-predictor correlations for 
students attending colleges which fall 
above and below the median on these 
indexes. The NS and AHSS indexes 
are estimates of a college’s production 
of undergraduates who later attain 
the PhD, when colleges are equated 
for talent supply by means of a re- 
gression analysis. 

It was assumed that at colleges most 
productive of PhDs, high grades are 
obtained by students with great po- 
tential for achievement and creativity, 
whereas students who get high grades 
at less productive colleges have less 
potential for achievement. It is as- 
sumed too that more productive col- 
leges also have atmospheres which 
demand personality traits associated 
with creative performance and that 
the criterion-predictor correlations will 
be related to the students’ descriptions 
of these institutions, as defined by 
their responses to the College Char- 
acteristics Index (CCI) (Thistle- 
thwaite, 1959a, 1959b); that is, it is 
expected that students who get high 
grades will resemble the typical stu- 
dent in terms of personality and will 
possess personal qualities compatible 
with the teaching practices and per- 
sonalities of the faculty. Table 3 pre- 
sents the results of these analyses. 
The differences in aptitude levels for 
groups of colleges with high and low 
indexes are generally small or insignif- 
icant; three of the eight comparisons 
of mean SAT scores are insignificant 
and the remaining five significant dif- 
ferences between high and low index 
institutions range from 1/11 to 1/5 of 
a standard deviation. Also, 85% of the 
variances between the high and low 
index samples are insignificant by F 
test. 

If high and low index student sam- 
ples are compared by listing those 
personality variables which are related 
to academic achievement in one sam- 
ple but are of less importance in the 
other sample, the following pattern of 
results is obtained. For boys attending 
institutions high on the NS index, high 
grades are associated with dedication 
to scholarship (1), dominance (3), 
superego (9), and emotionality (33). High 
grades at low index institutions are 
associated with submissiveness (14), 
timidity or lack of adventurousness 
(17), naivete (21), and passivity (32). 
Several variables—HSR, Play (4), 
Persistence (8), Cheerful (15), Self- 
Sufficient (24)—retain in both samples 
the relatively high correlations of the 
total sample. 

For girls, high grades for the high 
index sample (NS) are associated with 
dependence (2), extraversion (6), in- 
tolerance for ambiguity (10), submis- 
siveness (14), conventionality (20), 
confidence (22), conservatism (23), 
self-control (25), and lack of tension 
(26). High grades at low index insti- 
tutions are associated with control 
(35). 

The patterning of predictors for the 
samples on the AHSS index is some- 
what similar, but the interpretation of 
the differences between high and low 
index samples is less clear. 

TABLE 3 

CORRELATION OF COLLEGE GRADES WITH PREDICTORS FOR STUDENT SAMPLES CLASSIFIED 
BY COLLEGE PRODUCTIVITY INDEX (PHDS) 

                          Science Productivity                            Arts and Humanities Productivity
                          Males                   Females                 Males                   Females
Variable                   High        Low         High        Low         High        Low         High        Low
HSR                        38**        41**       [illegible] [illegible] [illegible]  39**        31**        43**
SAT-V                      11*         07         -02          08          05          10         -01          12
SAT-M                      16**        07          11         -08          07          14*         00          13
NMSS
   1. Scholarship          11*         05         -12          10          10          06          03          01
   2. Dependence           10          11*         16*         04          15**        06          15          06
   3. Dominance            13*         02          10         -09          08          06         -06          07
   4. Play                -23**       -28**        03         -14         -27**       -22**        05         -21**
   5. Intellectualism      08          08          05         -04          08          05          06         -04
   6. Introversion         06          10         -24**       -05          12*         01         -25**        05
   7. Parental Press       02         -01         -06         -06          00          00          05         -22**
   8. Persistence          18**        22**        26**        20*         23**        15**        22**        25**
   9. Superego             23**        14*         27**        23**        17**        22**        28**        18*
  10. Tol. for Ambig.      01          05         -16*         02          01          02         -12          01
16 PF
  11. A Sociable           01          01          03         -08         -08          09          01         -09
  12. B Intelligent        03         -02          05         -14          08         -07         -11          11
  13. C Mature            [illegible]  00          02          02          01         -04          04         -09
  14. E Dominant          -07         -17**       -20*        -14         -12*        -12*        -28**       -04
  15. F Cheerful          -20**       -20**       -01         -10         -22**       -17**       -08         -11
  16. G Persistent         13*         09          17*         16*         12*         13*         27*        [illegible]
  17. H Adventurous       -03         -17**       -10         -10         -20**        00         -10         -11
  18. I Effeminate         10          10          04          04          07          09          04          05
  19. L Paranoid           08         -05         -01         -06         -03          05         -07          02
  20. M Introverted        00          09         -27**        01          00          05         -21**        00
  21. N Shrewd            -09         -17**       -12         -01         -03         -21**       -13         -02
  22. O Insecure           02         -04         -22**        01         -04          02         -11         -05
  23. Q1 Radical          -10         -07         -28**       -12         -06         -12*        -23**       -12
  24. Q2 Self-Sufficient   12*         11*        -15          01          16**        07         -08          01
  25. Q3 Controlled        06         [illegible]  18*         10          12*         02          18*         08
  26. Q4 Tense             05         -01         -16*        -07          01          03         -06         -11
VPI
  27. Infrequency          05          07          12          08          05          04          08          07
  28. Physical Activity   -04         -07         -01         -03         -02         -05         -05          01
  29. Intellectuality      05          00          00          07          03          04         -02          12
  30. Responsibility       05          02          03          09         [illegible]  03          06          02
  31. Conformity           02         -06          09          04         -05          02         -06          11
  32. Verbal Activity     -03         -11*        -02         -07         -13*        -02         -15         -04
  33. Emotionality         11*         03         -07          08          05          08         -03          03
  34. Aggressiveness       02         -08         -06         -03         -05         -01         -12         -01
  35. Control              13*         09          08          20*         13*         08          21**        02
  36. Masculinity         -04         -03          04         -07         -01         -03         -08         -01
  37. Status               09         -01          01          11          02          04          01         -03
N                          330         311         155         156         316         325         159         152

* Significant at .05 level. 
** Significant at .01 level. 

These findings appear to be con- 
gruent with a few of the student and 
faculty press characteristics associ- 
ated with NS productivity (Thistle- 
thwaite, 1959a, 1959b). For example, 
the qualities of the male achiever in 
NS seem consistent with the student 
press for “Aggression” and “Empha- 
sis on High Academic Standards” 
typical of high index institutions and 
the absence of these presses in low 
index institutions, which have presses 
for “Closeness of Supervision” and 
“Directness of Teaching Methods” 
and which reward students character- 
ized by passivity and dependence. For 
the female samples and for both males 
and females on the AHSS index, the 
relation between the personality of the 
achiever and the institutional press 
is obscure and occasionally contra- 
dictory. 

Table 4 shows the correlations be- 
tween grades and predictors within the 
two individual colleges having the 
largest student samples. This analysis 
was performed to determine the effect 
of a more reliable criterion (the grad- 
ing system of a single college rather 
than the equating of multiple systems) 
and to explore the relationships be- 
tween the significant predictors for 
each college and the college environ- 
ment as assessed earlier by the CCI 
(Thistlethwaite, 1959b). 

For Harvard students, the signifi- 
cant predictors of college grades sug- 
gest that the student who does well 
academically has a high HSR and is 
unstable (13), feminine (18), naive 
(21), anxious (22), and emotional 
(33). The results for MIT students 
imply that academic achievement is 
again related to HSR, but in contrast 
to the Harvard sample, superego (9) 
and persistence (16) are of greater 
importance. 

TABLE 4 

CORRELATION OF COLLEGE GRADES WITH 
PREDICTORS FOR STUDENTS AT 
HARVARD AND MIT 

Variable                   Harvard     MIT
HSR                        40**        46**
SAT-V                      22          10
SAT-M                      18          27
NMSS
   1. Scholarship          18          07
   2. Dependence          -03          19
   3. Dominance            10          06
   4. Play                -23         -27
   5. Intellectualism      10         -24
   6. Introversion         19         -28
   7. Parental Press       19          10
   8. Persistence          16          12
   9. Superego             12          38*
  10. Tol. for Ambig.      16          06
16 PF
  11. A Sociable          -06          27
  12. B Intelligent        11         -07
  13. C Mature            -28*         01
  14. E Dominant          -04          00
  15. F Cheerful          -05         -06
  16. G Persistent         07          45**
  17. H Adventurous       -12          09
  18. I Effeminate         29*         02
  19. L Paranoid          [illegible]  24
  20. M Introverted        15         -27
  21. N Shrewd            -28*         02
  22. O Insecure           25*         34
  23. Q1 Radical           01         -28
  24. Q2 Self-Sufficient   03          15
  25. Q3 Controlled       -06         -30
  26. Q4 Tense             18          27
VPI
  27. Infrequency          19         -17
  28. Physical Activity   -18          01
  29. Intellectuality     -15          06
  30. Responsibility       05         -08
  31. Conformity           02         -15
  32. Verbal Activity     -08         -28
  33. Emotionality         28*        -19
  34. Aggressiveness       07         -13
  35. Control              02         -05
  36. Masculinity         -21         -05
  37. Status               13         -18
N                          62          38

* Significant at .05 level. 
** Significant at .01 level. 

These predictor patterns are inter- 
esting when related to the earlier ob- 
servations of these institutions made 
by students on the CCI (Thistle- 
thwaite, 1959b). The largest mean 
scale difference on the CCI for these 
colleges implies that Harvard is char- 
acterized by high Emotionality (“in- 
tensive, active, emotional expression 
versus ... restrained responsiveness”) 
and MIT by low Emotionality. These 
results are consistent with the finding 
that achievers at Harvard are unstable 
(13) and emotional (33), and that 
achievers at MIT are persistent and 
conscientious (16). 

These interpretations are tenuous 
because of the difficulty in equating 
environmental variables with personal 
attributes, the unrepresentativeness of 
the small student samples, and the 
fact that statistical tests of the as- 
sumed interactions were not per- 
formed. They do suggest, however, 
some promising hypotheses for future 
testing with larger samples. 


DISCUSSION 


The patterns of significant pre- 
dictors found in the present study are 
consistent with previous research, al- 
though the individual correlations in 
Tables 1, 3, and 4 should be inter- 
preted cautiously because of the sam- 
pling biases created by the loss of 
predictor and criterion data, the un- 
known biases which occur in student 
participation in the National Merit 
program, the high aptitude levels of 
the samples, and the experimental na- 
ture of the NMSS and VPI invento- 
ries. The overall superiority of HSR 
is well documented and this investiga- 
tion reports some of the correlates 
which have been generally assumed to 
account for the efficiency of HSR. 

The most effective nonintellectual 
predictors for the total samples, Su- 
perego (9), Persistence (8), and Play 
(4), are consistent with earlier find- 
ings about the correlations between 
grades and CPI scales (Holland, 
1959a). In that study, the Socializa- 
tion, Responsibility, Social Presence, 
and Self-Control scales of the CPI 
were found to be the best predictors. 
Similarly, a number of somewhat less 
effective predictors appear compara- 
ble in both studies and are replicated 
in as many as three different inven- 
tories, as in the case of “control” and 
“self-control.” 

The investigation of academic 
achievement for different college at- 
mospheres (Table 3) reveals that dif- 
ferent kinds of colleges reward differ- 
ent kinds of students, but the specific 
findings are unclear since the PhD 
productivity indexes are at best only 
crude approximations of the institu- 
tional atmospheres. More desirable 
studies would entail large student 
samples and single college-by-college 
comparisons. 

The characterization of the achiever 
by the 16 PF scales is of special in- 
terest, since Cattell’s description 
(Cattell & Drevdahl, 1955; Drevdahl, 
1956; Drevdahl & Cattell, 1958) of 
the “creative” person as intelligent, 
emotionally mature, dominant, ad- 
venturous, emotionally sensitive (fem- 
inine), introverted, radical, self-suf- 
ficient, tense, unsociable, depressive, 
less subject to group standards, and 
impulsive is clearly at odds with the 
present results. Only two of the 16 
PF scales indicative of creative po- 
tential are correlated with grades in 
the expected direction—emotionally 
sensitive, feminine (I) and surgent 
(F)—while five scales are signifi- 
cantly correlated with grades in di- 
rections which suggest a lack of cre- 
ative potential and the remaining five 
scales characteristic of “more crea- 
tive” people are not significantly re- 
lated to grades. The implication that 
the college achiever has less poten- 
tial for creative activity is supported 
by our findings about the correlates 
of HSR (see Table 2). A recent study 
of the correlates of teacher ratings 
(in which a teacher rating of Maturity, 
representative of a set of 12 ratings, 
was related to the 16 PF scales) also 
indicates that students with high HSR 
may have less creative potential than 
students with low HSR, assuming that 
the latter also have other attributes 
associated with creative behavior 
(Holland, 1959b). (Though the pres- 
ent sample was included in this study, 
the samples are not identical, because 
of the shrinkage of the present sample 
through the loss of criterion data.) 
MacKinnon’s report (1959) that re- 
search scientists and architects se- 
lected for study as exceptionally crea- 
tive have had undistinguished college 
grades is similarly consistent. 

The attributes of potential for crea- 
tivity outlined by Cattell are still 
tenuous, since we lack a predictive or 
longitudinal test of his findings. How- 
ever, the work of Barron (1957), 
Woodworth (1958), Getzels and Jack- 
son (1958), and Stein (1957) provides 
additional evidence for the syndrome 
which Cattell found in his researches. 
The impulsive (playful and Pd qual- 
ities), dominant, radical attributes of 
this hypothetical disposition are to 
some extent confirmed by one or more 
of these studies. 

The implications of the present in- 
vestigation, which are consistent with 
our growing knowledge of creativity, 
argue against the uncritical use of high 
school and college grades as predictors 
of post-college achievement and as 
unqualified criteria for selecting per- 
sons for admissions, scholarships, fel- 
lowships, or jobs. Similarly, the pre- 
diction of college grades appears to be 
an increasingly dubious research en- 
terprise. It seems preferable to de- 
velop more valid criteria of independ- 
ent achievement and creativity, even 
though colleges may not recognize and 
reward these tendencies. To continue 
the prediction of college grades only 
reinforces their somewhat specious 
validity and delays the development 
of more adequate criteria and the sub- 
sequent re-examination of educational 
goals and practices. 


SUMMARY 


The nonintellectual predictors of 
first-year college grades have been 
studied by means of the 16 PF; the 
NMSS, a personality inventory con- 
structed from a review of the litera- 
ture; and the VPI, a personality in- 
ventory consisting of occupational 
titles; as well as the SAT and HSR. 
The results for large samples of tal- 
ented students attending 277 institu- 
tions suggest that nonintellectual 
variables such as Superego, Persist- 
ence, and Deferred Gratification (the 
opposite pole of the Play scale) are 
useful in prediction and in under- 
standing the nature of the academic 
achiever. An empirical explication 
for the overall superiority of HSR 
as a predictor of college grades in this 
study was provided by correlating the 
aptitude, teacher rating, and nonin- 
tellectual variables with HSR. Stu- 
dent samples attending colleges with 
different atmospheres in terms of PhD 
productivity and student and faculty 
press as measured by the CCI vari- 
ables were also studied, and it was 
found that colleges with different at- 
mospheres reward different kinds of 
students. 


A COMPARISON OF PUPIL AND TEACHER PERCEPTIONS 
OF PUPIL PROBLEMS 


ROBERT T. AMOS AND 


Rhode Island College of Education 


In practice the basis for appraisal 
of certain aspects of personality is sub- 
jective. This is especially true in the 
area of social development where few 
if any diagnostic instruments are em- 
ployed and teachers’ judgments must 
be based almost entirely on personal 
observations. The validity of such ob- 
servations, however, is difficult to 
prove or disprove since what is viewed 
to be problem behavior by one may 
not be considered as such by another. 
Previous investigators (Thompson, 
1940; Wickman, 1928) have found that 
judgments of teachers differ from those 
of clinicians and child psychologists 
with regard to the kinds of behavior 
problems which were considered ser- 
ious; others (Gage & Suci, 1951; Gron- 
lund, 1950) have found that the ability 
to judge the behavior of pupils ac- 
curately is positively related to the 
teacher’s effectiveness with them. 

The present study compares pupil 
and teacher perceptions of pupil prob- 
lems to determine whether the prob- 
lems which teachers recognize in pupil 
behavior agree with those which the 
pupils themselves identify. 


METHOD 


All 21 of the homeroom teachers in a 
junior high school in the District of Co- 
lumbia participated in the study as did 87 
pupils whom these teachers identified as 
pupils with problem behavior. The teachers, 
12 women and 9 men, were requested to 
identify those pupils whom they considered 
“problems” after having had them in their 
respective homerooms for 8 months and to 
rate the pupils on the Mooney Problem 
Check List. An identical form consisting of 
210 items representing a variety of problems 
was administered to the 87 pupils in group 
situations by the guidance counselor. Of 
these 87 pupils identified, 56 were boys and 
31 were girls. Sixty of these were divided 
equally between the seventh and eighth 
grades and 27 were in the ninth grade. 


RESULTS 

The difference (see Table 1) between 
the average number of items checked 
by the pupils and teachers suggested 
that teachers fail to sense many adoles- 
cent problems. The pupils identified 
more problems than their teachers did 
in the area of School and considered 
this area their greatest concern. The 
teachers were apparently aware of the 
importance of such a problem area to 
adolescents although they checked 
items in the area of Self-Centered 
Concern to be the problems most rep- 
resentative of this group. The pupils 
checked problems in the area Money, 
Work, the Future second in frequency, 
whereas teachers seemed relatively in- 
sensitive to the significance of these 
problems to junior high school pupils. 
Their responses placed this area sixth 
in rank among seven categories. Com- 
pared to the adolescents, teachers also 
seemed relatively insensitive to the 
importance of the problems in the 
area of Health and Physical Develop- 
ment. 

The first several columns in Table 1 
compare the sexes. In general, differ- 
ences are similar to those for the total 
group, except in the instance of school 
problems where the difference between 
teachers and pupils is significant only 
in the case of girls. Differences were 
reliable between teachers and pupils at 
all ages in the areas of Money, Work, 
the Future, and Health and Physical 
Development, but at the seventh grade
level, when the responses of girls and
boys were separated, differences appeared
in two additional areas. There
was increasing awareness with grade
on the part of the pupils regarding
problems in the areas of Boy-Girl Relations
and Money, Work, the Future.


TABLE 1

MEAN SCORES AND t RATIOS FOR TEACHER-PUPIL DIFFERENCES ON ITEMS
CHECKED IN PROBLEM AREAS

                                | Number   |       Mean Scores, Boys      |       Mean Scores, Girls     |       Mean Scores, Total
Problem Areas                   | of Items | Boys | Teachers | t Ratios   | Girls | Teachers | t Ratios  | Pupils | Teachers | t Ratios
Health and Physical Development |    30    | 3.67 |   1.25   |  4.15**    | 5.03  |   1.61   |  4.67**   |  4.17  |   1.37   |  6.77**
School                          |    30    | 5.51 |   6.12   |   .67      | 6.03  |   4.22   |  2.06*    |  5.70  |   5.47   |   .41
Home and Family Life            |    30    | 3.16 |   2.42   |  1.08      | 4.22  |   1.90   |  2.60*    |  3.46  |   2.24   |  2.50*
Money, Work, the Future         |    30    | 5.42 |   1.92   |  4.83**    | 5.51  |   2.29   |  6.57**   |  5.34  |   1.70   |  7.00**
Boy and Girl Relations          |    30    | 2.75 |   1.69   |  1.32      | 3.74  |   1.87   |  2.17*    |  3.00  |   1.75   |   -
Relations to People in General  |    30    | 3.50 |   4.42   |  1.21      | 4.51  |   5.96   |  1.14     |  3.86  |   4.97   |   -
Self-Centered Concerns          |    30    | 5.08 |   5.62   |  1.28      | 6.28  |   5.74   |   .45     |  5.17  |   5.66   |   .79
Number of subjects              |          |  56  |    21    |            |  31   |    21    |           |   87   |    21    |

* Significant at .05 level.
** Significant at .01 level.


Listed below are specific items on
which significant differences occurred,
thus suggesting some particulars in the
misperceptions of teachers. Because of
space limitation, only those items were
selected which teachers or pupils checked
as describing 25% or more of the pupils
and for which the difference between
teachers and pupils was significant.
Teachers exceeded pupils in the number
of times those items preceded by an
asterisk were checked; otherwise pupils
exceeded teachers. With the exceptions
noted, differences held for both sexes.
In practically all instances listed,
differences between teachers and pupils
were significant at the .01 level of
confidence.


SCHOOL:


Afraid of failing in school

Worried about grades 

Trouble with arithmetic 

*Don’t like to study 

*Not interested in books 

*Not spending enough time in study (girls 
only) 


MONEY, WORK, THE FUTURE:


Wanting to buy more of my own things 
Needing a job during vacation 
Wanting to earn some of my own money 
Spending money foolishly 

Having to ask my parents for money 
Needing to find a part-time job 
Deciding what to take in high school 


SELF-CENTERED CONCERN: 


Forgetting things 

Trying to stop a bad habit 

Being punished for something I didn’t do 
Being nervous 

Being afraid of making mistakes 
Sometimes wishing I’d never been born 
*Lacking self-control 
*Not taking things seriously enough (boys only)


*Sometimes not being as honest as I 
should be 

*Giving in to temptations 

*Being lazy (boys only) 

*Being careless 


PEOPLE IN GENERAL: 


*Being stubborn 
*Getting into arguments (boys only) 


HOME AND FAMILY:


Worried about someone in the family 
Parents working too hard 


HEALTH AND DEVELOPMENT: 


Getting tired easily 


In the main, the teachers’ observa- 
tions tended to be confined to those 
problems which disrupt classroom or- 
der and procedure and threaten the 
position of the teacher. 

These results led to the conclusion 
that these homeroom teachers recog- 
nize pupils with problems, but the 
problems that they recognized were 
limited in scope compared to the range 
of problems which the pupils them- 
selves considered important. Whether 
the problems so reported by pupils in 
this study are “real” one cannot know, 
since what pupils have to say about 
themselves is always marked by some 
distortion. It is possible, too, that these 
teachers tended to be influenced in 
their judgments by their own problems 
and that they were likely to see more 
problems in areas in which they them- 
selves were psychologically insecure. 
We seriously question whether teacher 
rating of behavior is adequate to iden- 
tify the problems of a particular pupil 
without including the pupil’s own in- 
terpretations and without considering 
the marked difference between the ex- 
periential world of the teacher and 
that of the pupil. No teacher is ex- 
pected to become detached from his 
own experience and to develop a com- 
plete awareness of pupils’ feelings and 
attitudes. It seems apparent, however,
that the teachers in this study need to
make more adequate judgments of the 
kind of problems that are characteristic 
of pupils in the area of Health and 
Physical Development and the kind of 
social and economic pressures which 
give rise to problems, or rely more on 
devices such as the Mooney Problem 
Check List administered to pupils. 


SUMMARY AND CONCLUSIONS 


To determine whether problems 
which teachers recognize in pupil be- 
havior agree with those which pupils 
themselves identify, 21 teachers in a 
junior high school and 87 pupils whom 
they had identified as behavior problems
were asked to respond to the
Mooney Problem Check List. Teachers 
identified fewer problems as character- 
istic of the students than did the stu- 
dents themselves, and appeared espe- 
cially unaware of the extent of student 
problems in the areas of Money, Work, 
the Future, and Health and Physical 
Development. Teachers’ judgments
were more similar to those of boys
than to those of girls, and likewise
more similar to those of the ninth grade
than to those of the seventh grade. School
problems, Work, and Self-Centered 
Concern appeared to demand the pu- 
pils’ attention, and the whole array 
of problems recognized by the pupils 
seems somewhat typical of those which 
other investigators (Arnold & Mooney, 
1943; Pflieger, 1947) have found to confront
similar junior high school pupils.


RELATION OF ACHIEVEMENT MOTIVATION TO
ACADEMIC ACHIEVEMENT IN STUDENTS
OF SUPERIOR ABILITY¹


The problem of underachievement
has been of interest to psychologists for 
some time. Since the impact of the 
Space Age on educational philosophy 
and policy, particular concern has centered
on the underachievement of college students
of superior ability, especially those in the
scientific and technical fields. Despite this interest,
and despite considerable research de- 
voted to the topic, there has been little 
advance in isolating the nonintellec- 
tual factors associated with under- 
achievement, particularly when efforts 
are made to predict future performance 
on the basis of these variables. For ex- 
ample, the hypothesis has long been 
held, especially by educators, that un- 
derachievement is a manifestation of 
maladjustment, almost by definition. 
Most research (e.g., Berger & Sutker, 
1956; Burgess, 1956; Hoyt & Norman, 
1954), however, has failed to show any
difference in overall adjustment among 
over-, under-, and moderate-achieving 
students. Various biographic and dem- 
ographic variables have likewise failed 
to show consistent relationships with 
achievement (Asher & Gray, 1940; 
Morgan, 1952; Myers, 1953; Schultz 
& Green, 1953). 

Most perplexing, and vexing, of all
has been the inability to relate actual
achievement to measures of motivation
to achieve. Such a relationship would
seem virtually tautological; but the
data have failed, nevertheless, to confirm
such a relation with consistency.
The studies which have supported the
hypothesis (e.g., Burgess, 1956; Liverant,
1958; Morgan, 1952) have been
counterbalanced by those with contradictory
results (e.g., Lowell, 1950; McClelland,
Atkinson, Clark, & Lowell,
1953; Parrish & Rethlingshafer, 1954).


¹ This study is based on part of a master’s
thesis done by the senior author while a student
at Purdue University under the guidance
of Mark Stephens. Previous research, on
which this study is based, was subsidized by
the Purdue Research Foundation under
Grant No. XR-1760.

These failures are frustrating not 
only to those interested in dealing with 
the practical problem of underachieve- 
ment, but also to those whose primary 
concern has been with motivational 
theory and measurement per se. Failure
of these studies has been blamed on
the lack of sensitivity and validity of 
the measurement devices employed. 
But this only makes more obvious, and 
embarrassing, the crude state of mo- 
tivational theory which cannot gener- 
ate valid measurement operations. 

This study was directed simultane- 
ously at the problem of underachieve- 
ment itself and at the problem of meas- 
urement and theory of achievement 
motivation. The purposes were to (a) 
replicate and extend, with a well-con- 
trolled population of college students 
of superior ability, previous studies of 
the relationship between achievement 
motivation and academic achievement; 
(b) assess the relative predictive valid- 
ity of various measures of achievement 
motivation and theoretical conceptions 
thereof; and (c) assess the convergent 
validity of these measures. 








METHOD 


Theoretical Frameworks 


As suggested by Rotter (1958), the 
various motivational theories and 
measurement devices can be catego- 
rized as to their complexity into at 
least three discernibly different levels: 

The Simplest Conception. This is 
represented in many “common sense” 
theories, as well as many classical per- 
sonality theories, and represented psy- 
chometrically by McClelland’s need 
Achievement Scale (McClelland et al., 
1953). In these conceptions, the single 
variable of strength of need is the pre- 
dictor. The paradigmatic expression of 
such a conception would be: “The 
stronger the need, the stronger the be- 
havior”—or the more likely, or more 
frequent, the behavior. At this level, 
then, need is conceived and measured 
as an absolute quantity. 

A Slightly More Complex Concep- 
tion. Here the various needs are con- 
ceived as competing, so that theoretical 
and psychometric efforts are directed 
toward the relative strengths of vari- 
ous needs, rather than an absolute as- 
sessment of a single need. Psychomet- 
rically, the paradigm is represented by 
Edwards’ Personal Preference Sched- 
ule (1954). Conceptually it is repre- 
sented at least implicitly in some as- 
pects of Freudian and other personality 
theories and lay thinking. The basic 
assumption can be expressed as: “The 
strongest need at a given time will de- 
termine behavior.” 

The Most Complex Conceptions. 
These conceptions explicitly postulate 
other variables as interacting with 
(relative) need strength to determine 
behavior. These “other variables” can 
include previous learning, or rein- 
forcement, of specific behaviors, as in 
a Hullian system; various situational 
determinants of behavior, as suggested 
in field theory; or other quantifiable
contemporary psychological characteristics
of the individual, such as his
level of expectancy that a given be- 
havior will in fact lead to gratification 
of the need as in Tolman’s theory. 
Liverant’s Goal Preference Inventory 
(1958) represents a model allowing for 
interaction between need and situa- 
tion. Rotter’s theory (1954) accounts 
systematically for both situational de- 
terminants and expectancies, and in 
reference to the latter adds the varia- 
ble of Minimal Goal. The Minimal 
Goal equals the amount of reward 
(e.g., the level of grades) which con- 
stitutes a positively reinforcing state 
of affairs for the organism. The Mini- 
mal Goal is thereby distinguished from 
the (relative) strength of the need to 
attain that goal; Expectancy pertains 
to the probability of attaining the 
Minimal Goal. 

In his categorization of motivational
theories, Rotter distinguishes those
which employ only one variable, such
as Expectancy, which interacts with
need strength from those which em- 
ploy both situational and other “in- 
ternal” variables in multiple interac- 
tion. Such a distinction is of course 
legitimate. However, the categoriza- 
tion proposed herein seems adequate, 
and best suited, to the theories and 
measures of achievement motivation 
relevant to the problem of under- 
achievement. 


Measures 


Need. The two scores computed from the 
Edwards Personal Preference Schedule 
(EPPS) were those for n Ach (need Achieve- 
ment) and n Nur (need Nurturance); the 
latter was selected because it appeared to 
correspond most closely with the variable of 
need for love and affection in the Goal Pref- 
erence Inventory. The EPPS, therefore, af- 
forded measures of n Ach and n Nur, each 
in terms of the strength of that need rela- 
tive to the 13 other needs represented in the 
test. The hypotheses to be tested with EPPS 
scores were (a) that obtained grades would
be positively related to n Ach scores and
(b) that they would be inversely related to 
n Nur scores. 

The Goal Preference Inventory (GPI) 
affords measures of the relative strengths of 
need for recognition-status and need for love 
and affection in two situational contexts: 
academic situations and social situations. 
The scores yielded are AcR (need for rec- 
ognition-status in academic situations), SoR 
(need for recognition-status in social situa- 
tions), and SLA (need for love and affection 
in social situations); each need-situation 
score is the complement of the sum of the 
other two scores. Predictions were that 
grades would be positively related to AcR 
and negatively related to SLA. 
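
The complementary scoring described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that every forced-choice item credits exactly one of the three need-situation categories, so that the three scores sum to the number of items. The item count, the response list, and the function name `score_gpi` are hypothetical and are not taken from the inventory itself.

```python
from collections import Counter

# Hypothetical illustration of forced-choice (ipsative) scoring: every item
# credits exactly one category, so the three scores always sum to the item count.
CATEGORIES = ("AcR", "SoR", "SLA")


def score_gpi(choices):
    """Count how often each need-situation category was chosen."""
    counts = Counter(choices)
    unknown = set(counts) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unexpected categories: {sorted(unknown)}")
    return {cat: counts.get(cat, 0) for cat in CATEGORIES}


# Invented responses to an invented 10-item form.
responses = ["AcR", "SLA", "SLA", "SoR", "AcR", "SLA", "AcR", "SLA", "SoR", "SLA"]
scores = score_gpi(responses)
n_items = len(responses)

# Each score equals the item count minus the sum of the other two.
sla_from_complement = n_items - (scores["AcR"] + scores["SoR"])
print(scores, sla_from_complement)
```

Under this assumption a high SLA score necessarily goes with a low combined AcR and SoR score, so the three scores cannot vary independently.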

The Incomplete Sentences Blank (ISB) 
was administered following standard instruc- 
tions; but responses were scored for Need 
Value (NV) of and Freedom of Movement 
(FM) (expectancy) for academic achieve- 
ment. The ISB was used to allow for com- 
parison of results from a semistructured de- 
vice with those of objective inventories. It 
also afforded a disguised, or indirect, meas- 
ure of expectancy for success. The ISB proto- 
cols were scored according to a manual 
which had been derived in previous research. 
Each response which seemed even indi- 
rectly pertinent to academic achievement 
was scored separately for NV and for FM, 
or expectancy of success; a 5-point scale was 
used, a high score indicating low NV or FM. 
Scoring for NV followed, by instruction to 
judges, the model of assessing NV for aca- 
demic achievement as relative to other 
needs, although no other needs were scored 
as such. Scoring for FM was essentially on 
an absolute basis, indicative of the subject’s 
apparent expectancy, or subjectively de- 
termined probability level, that he would at- 
tain whatever level of performance defined, 
for him, success. Interjudge reliability was 
.71 for NV and .58 for FM. Predictions from 
ISB scores were that grades would be posi- 
tively related to both NV and FM; no 
prediction was made concerning the relation- 
ship between FM and NV in this popula- 
tion. 

In addition, subjects were asked simply 
and directly to state (a) the grade-point 
average they expected to make during the 
current semester (E), (b) the lowest grade- 
point average they would consider at least 
minimally satisfying (MG), and (c) the 
highest grade-point average they thought 
they could make “if you worked as hard as 
you could” (Max). The fourth measure of 
need strength, then, was computed by subtracting
the “expected” grade from the
“maximum” grade; the rationale was that a 
subject who expected to get considerably 
lower grades than he believed he “could” 
make could be assumed not to be highly mo- 
tivated toward maximum performance. This 
measure was assumed to be crude and fairly 
undisguised, and did not afford an assess- 
ment of NV as relative to other needs; nor 
did it involve any correction for Minimal 
Goal. Nevertheless, the prediction was that 
grades would be directly related to this dis- 
crepancy score. 

Expectancy. Aside from the ISB index of 
FM, the only two measures of expectancy 
were derived from the subjects’ direct ex- 
pectancy statements described above. The 
first of these was the direct report of the 
expected grade-point average. This actually 
does not correspond to Rotter’s construct of 
FM, which pertains to expectancy of suc- 
cess—that is, for achieving at or above the 
Minimal Goal. However, it was assumed that 
there would be some consistency across sub- 
jects as to Minimal Goal, so that the higher 
the individual’s expected grades, the more 
likely he expected to attain his Minimal 
Goal. The second expectancy measure cor- 
responds more directly to Rotter’s theoreti- 
cal model: it was attained by subtracting the 
Minimal Goal level from the expected 
grades. With each of these expectancy meas- 
ures, the prediction was of a direct relation- 
ship to grades. 

Minimal Goal. The sole measure of MG 
available was the subject’s direct statement 
of it. Scoring ISB protocols for MG esti- 
mates was contemplated, but the responses 
did not seem to lend themselves to such in- 
ferences. Again, a direct relationship between 
grades and minimal goals was predicted. 
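
The two derived scores built from the direct grade estimates can be written out as a short sketch. The class name, field names, and the example subject are hypothetical; only the two subtractions (maximum minus expected, and expected minus Minimal Goal) come from the text.

```python
from dataclasses import dataclass


@dataclass
class GradeEstimates:
    """One subject's three direct grade-point estimates."""
    expected: float      # E: average expected for the current semester
    minimal_goal: float  # MG: lowest average still minimally satisfying
    maximum: float       # Max: best average thought possible with full effort

    def need_discrepancy(self) -> float:
        """D score used as a need measure: maximum minus expected."""
        return self.maximum - self.expected

    def expectancy_margin(self) -> float:
        """Second expectancy measure: expected minus Minimal Goal."""
        return self.expected - self.minimal_goal


# Hypothetical subject; the values are invented for illustration.
subject = GradeEstimates(expected=5.0, minimal_goal=4.5, maximum=6.0)
print(subject.need_discrepancy())   # 1.0
print(subject.expectancy_margin())  # 0.5
```

Keeping the two differences as separate methods mirrors the distinction the text draws between a need measure and an expectancy measure, even though both rest on the same three statements.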


Subjects 


The subjects were 72 freshman college 
students of a midwestern state university, 
each of whom had been granted by the uni- 
versity a Special Merit (honorary) Scholar- 
ship. Such a scholarship constitutes recogni- 
tion by the university of very superior 
students who do not need financial aid to ob- 
tain their education. All but one of the 72 
students took the Scholarship Qualifying 
Test (SQT) one year before entering college. 
Each student ranked in the top 10% of his 
high school graduating class; all were single 
and between the ages of 17 and 19. The 
group, then, was quite homogeneous as to 
aptitude (SQT), previous scholastic performance
(high school grades), socioeconomic
status (requiring no financial aid), age, and
marital status. 


Procedure 


Students were asked to make the three 
grade estimates first. They then completed 
in order the ISB, GPI, and EPPS. The group 
was tested toward the latter part of the fall 
semester under normal testing conditions 
with no time limit imposed; tests were com- 
pleted in the same sequence. 


RESULTS 


The mean grade-point average for 
the entire scholarship group was 4.97, 
almost a B average on the university 
grading scale (A = 6, B = 5, C = 4, 
D = 3). For purposes of comparison, 
the mean index for all university fresh- 
men was 4.11. 

Of the 72 students tested, those 17 
who were “distinguished students,” re- 
ceiving indices of 5.54 or above, were 
labeled “high achievers”; the 17 stu- 
dents receiving the lowest indices 
(below 4.5) were considered “low 
achievers.” Upon examination, it was 
found that the two groups differed 
greatly in two respects: female-male 
ratio and the proportion of engineer- 
ing students in the two samples. In the 
high achieving group, there were 6 
women and 11 men; in the low achiev- 
ing group, there was 1 woman and 16 
men. Likewise, there was a marked discrepancy
(χ² = 3.23, significant beyond
the .10 level) between groups in
the particular schools represented: of 
the 17 high achievers, 8 were from the 
School of Engineering, whereas 14 of the
17 low achievers were men from engineering.
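
The engineering comparison can be checked with a minimal 2 x 2 chi-square sketch. The counts are those given in the text (8 of 17 high achievers and 14 of 17 low achievers in engineering). Whether the continuity (Yates) correction was applied is not stated in the report and is an assumption here, and the function name is hypothetical. With the correction the statistic comes to about 3.22, close to the reported 3.23; without it the value is about 4.64.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square for a 2x2 table [[a, b], [c, d]], optionally Yates-corrected."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Continuity correction: shrink the absolute difference by n / 2.
        diff = max(diff - n / 2, 0.0)
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff ** 2 / denom


# Rows: high achievers, low achievers; columns: engineering, other schools.
corrected = chi_square_2x2(8, 9, 14, 3, yates=True)
uncorrected = chi_square_2x2(8, 9, 14, 3, yates=False)
print(round(corrected, 2), round(uncorrected, 2))
```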

In order to control school and sex 
differences, the two groups at the 
extremes of the distribution were 
matched on these variables. The low 
achieving group was retained intact, 
with adjustments made in the high 
achieving group. By this procedure, 
the groups obtained each contained 14 
students in engineering; each had 1 female
and 16 males, totaling 17 in each
group. The high achievers’ mean index 
was 5.49, no student falling below 5.2; 
they carried a mean of 19.5 hours of 
credit. The low achievers’ mean index 
was 4.00, no student achieving beyond 
4.5; the mean credit hour load was 
17.8, although the range was 14 to 20 
hours. 

Differences between the two groups 
were tested by Student’s t procedure.
Preceding each test, a preliminary
F max test for homogeneity of variance
was run. In all cases but two, the hy- 
pothesis of homogeneity was tenable. 
The two exceptions were the ISB NV 
and the discrepancy score (D = maxi- 
mum minus expected grades); the 
Mann-Whitney U test was used to 
evaluate group differences with these 
measures, and in order that all statis- 
tical tests would be comparable, the U 
test was used on all other group com- 
parisons as well. Correlation coeffi- 
cients were used to assess the relation 
of achievement motivation to aca- 
demic achievement for all 72 subjects. 
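
For readers unfamiliar with the U test, a minimal sketch of the statistic follows. It uses the usual rank-sum definition with average ranks for ties and returns the smaller of the two U values. The data are invented and are not the study's scores, and the function name is hypothetical.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return the smaller Mann-Whitney U, using average ranks for ties."""
    pooled = sorted(
        [(value, 0) for value in sample_a] + [(value, 1) for value in sample_b]
    )
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    rank_sum_a = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    n_a, n_b = len(sample_a), len(sample_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return min(u_a, u_b)


# Invented discrepancy scores for two small groups (not the study's data).
high = [0.2, 0.3, 0.5, 0.5, 0.6]
low = [0.5, 0.8, 1.0, 1.1, 1.4]
print(mann_whitney_u(high, low))
```

Because the statistic depends only on ranks, it does not require the homogeneity of variance that failed for two of the measures.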

For need for achievement, neither of 
the standardized inventories discrimi- 
nated between the high and low 
achievers, nor did the ISB NV scoring. 
The sole need for achievement measure 
which yielded the predicted results 
was the discrepancy between the sub- 
ject’s statement of what he thought 
was the maximum performance of 
which he was capable and what he 
thought he was most likely actually to 
achieve (U = 72, p < .01). The results 
with ISB NV, EPPS n Ach, and GPI 
AcR were all in the opposite direction 
to that predicted, although not sig- 
nificantly so. However, when the GPI 
SLA was computed—being the com- 
plement, or inverse, of AcR and SoR 
scores combined—a statistically sig- 
nificant t of 2.05 (p < .05) was found: 
high achievers had significantly higher 
SLA scores (and, consequently, lower
scores on AcR and SoR combined)
than did low achievers. The n Nur
scores on the Edwards’ test reflect the
same trend, but the groups do not differ
significantly (t = 1.66).

Correlation coefficients computed 
for the entire 72 subjects indicate no 
significant relationship between actual 
achievement (grades) and need for 
achievement as measured by ISB NV, 
EPPS n Ach, and GPI AcR. However, 
the discrepancy score, which was 
available for 71 subjects, had a significant
relationship to actual achievement
(r = .34, p < .01). Although
there were no significant correlations 
for the 72 subjects between need for 
love and affection measures and 
grades, the correlation between the 
need for recognition measure and 
grades (r = -.22) approached significance
(an r of .231 is required for p < .05).
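
The significance of a correlation of this kind is conventionally judged by converting r to a t statistic with N - 2 degrees of freedom. The sketch below applies that textbook conversion to two coefficients reported above. The report does not say how its significance levels were obtained, so this is offered only as an assumed procedure, and the function name is hypothetical.

```python
import math


def t_from_r(r, n):
    """t statistic for testing a Pearson r against zero (n - 2 degrees of freedom)."""
    df = n - 2
    return r * math.sqrt(df / (1.0 - r * r))


# Correlations reported in the text.
t_discrepancy = t_from_r(0.34, 71)    # discrepancy score vs. grades, N = 71
t_recognition = t_from_r(-0.22, 72)   # recognition measure vs. grades, N = 72
print(round(t_discrepancy, 2), round(t_recognition, 2))
```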

All differences between groups on 
the expectancy measures as shown in 
Table 1 were in the predicted direc- 
tion, although one (the discrepancy 
between the subject’s statements of ex- 
pected grades and Minimal Goal) was 
not statistically significant. For the 71 
subjects for whom expectancy state- 
ments were available, there is a signifi- 
cant relationship between actual aca- 
demic achievement and expectancy for 
success, as measured by ISB FM (r = 
.29, p < .05), the stated Minimal Goal 
(r = .68, p < .01), and the stated expectancy
(r = .68, p < .01); no significant
relation was found in using the D
score as an expectancy measure (r = 
.06). In general, then, the students’ ex- 
pectancy statements were better pre- 
dictors of their later achievement than 
were any of the inventory results. 

Table 2 presents the intercorrela- 
tions of the four need for achievement 
measures, all of which are in the pre- 
dicted direction; but the highest is .33, 
and only two of the six are statistically 
significant. Thus, it can be seen that
the convergent or concurrent validity
of these four measures, purportedly of
the same variable, is extremely low.


TABLE 1

COMPARISON OF HIGH ACHIEVERS AND LOW
ACHIEVERS ON EXPECTANCY MEASURES
(N = 34)

                     |        Means            |
Measures             | Low       | High        | t
                     | Achievers | Achievers   |
D score (a)          |   0.31    |   0.38      |  0.93
ISB FM (b)           |   3.44    |   3.14      | -2.68**
Stated Minimal Goal  |   4.01    |   4.81      |  5.28**
Stated expectancy    |   4.32    |   5.18      |  6.37**

(a) Expected grade minus minimal grade estimate.
(b) Inverse measure.
** p < .01.


TABLE 2

INTERCORRELATIONS OF MEASURES OF
NEED FOR ACHIEVEMENT (a)

             | GPI AcR      | EPPS n Ach | ISB NV
EPPS n Ach   | .33**        |            |
ISB NV       | [illegible]  | .19        |
D score (b)  | .04          | .25*       | .10

(a) N = 72, except in the case of D scores, when N = 71.
(b) Maximum grade estimate minus expected grade.
* p < .05.
** p < .01.

To further explore the question of 
validity of the need measures, coeffi- 
cients of internal consistency of two of 
these measures were computed. (The 
ISB NV score and D score measures 
did not lend themselves to this proce- 
dure.) 

As shown in Table 3, it appears that 
the internal consistency of both the 
GPI and EPPS scales is, although sub- 
stantial, sufficiently low to consider- 
ably attenuate relationships of either 
to any other scale or variable. The in- 
ternal consistency estimates are still, 
however, considerably higher than the 
intercorrelations of the various need
scales; “need for achievement” scores,
then, appear to be considerably in- 
fluenced by specific measurement op- 
erations, format, specific characteris- 
tics of items, and other “apparatus” 
variables of this sort. 
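
The attenuation argument can be made concrete with Spearman's correction, which divides an observed correlation by the square root of the product of the two reliabilities. The sketch below applies it to the reported AcR and n Ach values. Treating the internal consistency coefficients as the reliabilities is an assumption, the report itself gives no corrected coefficient, and the function name is hypothetical.

```python
import math


def disattenuated_r(r_xy, rel_x, rel_y):
    """Spearman's correction for attenuation: r_xy / sqrt(rel_x * rel_y)."""
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Reported values: GPI AcR with EPPS n Ach, r = .33 (Table 2);
# internal consistencies of .64 and .79 (Table 3).
corrected_r = disattenuated_r(0.33, 0.64, 0.79)
print(round(corrected_r, 2))
```

Under these assumptions the corrected value is about .46, still below either internal consistency estimate, which is in line with the observation that unreliability alone does not account for the low intercorrelations.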

In general, the intercorrelations of 
the various expectancy measures (Ta- 
ble 4) are in the predicted direction. It 
is of particular interest that the FM 
scores obtained from the ISB and the 
Minimal Goal statements were related 
to a statistically significant degree, 
particularly since these two measure- 
ment operations are maximally differ- 
ent. 

TABLE 3

COEFFICIENTS OF INTERNAL CONSISTENCY
FOR GPI AND EPPS NEED FOR
ACHIEVEMENT MEASURES

Measures       | Coefficient (a) | Mean (b) | SD (b)
GPI AcR        |                 |          |
  Subtest I    | .67             | 11.58    | 3.68
  Subtest II   | .62             | 10.1     | 3.18
  Total AcR    | .64 (c)         | 21.69    | 5.73
EPPS n Ach     | .79             | 16.85    | 4.55

(a) Kuder-Richardson estimate of reliability (Guilford,
1956, pp. 454-455).
(b) The means and standard deviations for each variable
are based on the total raw scores for the 72 subjects.
(c) Average coefficient (means of Subtests I and II
combined) of total AcR test is by Fisher’s z transformation
(Guilford, 1956, pp. 325-326).


TABLE 4

INTERCORRELATIONS OF EXPECTANCY
MEASURES
(N = 71)

               | Expected | D score (a) | ISB FM
D score (a)    | .19      |             |
ISB FM         | .31**    | -.08        |
Minimal Goal   | .84**    | -.36**      | .34**

(a) Expected grade minus Minimal Goal.
** p < .01.


The curricula for the fall semester
of high and low achieving groups were
analyzed to find whether obtained
grades were a reflection of course levels
rather than excellence of performance.
In general, a sizeable majority of the 
34 students in the extreme achieving 
groups was enrolled in the same cur- 
riculum, i.e., chemistry, mathematics, 
speech, graphics, etc. Further, the high 
achievers were taking more advanced 
courses on the whole, regardless of sub- 
ject matter, than low achievers. 

There was no significant difference 
between the high and low achievers’ 
size of high school graduating class 
(χ² = 0.11). The majority of students
in both groups graduated in classes of 
less than 150. 


DISCUSSION


These results have important impli- 
cations, not only for the specific hy- 
potheses under consideration, but for 
the methods and concepts employed. 

Generally, none of the three experi- 
mental hypotheses was fully supported 
by all the measurement devices used. 
The initial hypothesis—that achieve- 
ment motivation is greater for aca- 
demically successful students than for 
unsuccessful students—was supported 
by only one measure. 

In a recent article by Campbell and 
Fiske (1959), convergent validation 
(agreement among independent opera- 
tions for measuring the same variable) 
and discriminant validation (low cor- 
relations between measures of different 
variables) are discussed as aspects of 
the validation process. Although all of 
the need measures in this study were 
correlated in the predicted direction, 
only two of the six intercorrelations were
significant. When one considers
the low convergent validity in addition 
to the failure to predict the criterion, 
enough doubt is cast on all the meas- 
ures that (a) it can reasonably be as- 
sumed that there has not yet been an 
adequate test of the relationship between
achievement motivation and
academic achievement and (b) the
predictive validity of the measures 
which have been used in previous re- 
search is very much to be questioned. 

If one considers not only these ques- 
tionable validities, but also the low in- 
ternal consistency of the two inven- 
tories, it seems likely that the variable 
of need achievement, however defined, 
is not a unitary trait, i.e., that its gen- 
erality, so to speak, is low. In common 
thinking, need achievement may very 
possibly be specific not only to the area 
of achievement (such as grades, money, 
or athletic skills), but also quite spe- 
cific within each area, e.g., science vs. 
humanities in the academic area. Fur- 
ther, achievement may depend to some 
degree upon many situational varia- 
bles, such as rewarding or punishing 
teachers, etc. If this is so, the need 
achievement tests and the construct 
underlying the tests are of limited util- 
ity. Although speculative, such speci- 
ficities may help to explain why tests 
of generalized “need to achieve” (in 
any area, at any time, and in any situ- 
ation) are not predictive of more spe- 
cific achievement behaviors and why 
research results have been so conflict- 
ing to date. 

In general, the students’ expectancy 
statements—both the Minimal Goal 
and stated expected grade average— 
predicted more successfully their sub- 
sequent achievement than did any of 
the inventory results. It would seem, 
then, that an expectancy variable, sep- 
arate from or interacting with need 
strength, would indeed be a valuable 
addition to motivational theory. In ad- 
dition, the minimal goal seems of more 
importance in a student’s academic 
performance than is achievement need 
value. So far most theoretical concep- 
tions and tests of need achievement 
have thoroughly confounded the mini- 
mal goal (how high the goal aspired to) 
and need value (how strong the aspiration),
with the result that the confounding
itself has confused theoretical
formulations and perhaps been respon- 
sible for the failures and inconsistency 
in previous research. 

SUMMARY 

The objectives of this study were 
to examine the relationship between 
achievement motivation and academic 
achievement, and to assess the relative 
predictive and convergent validity of 
measures of achievement motivation. 

Subjects were 72 Special Merit (hon- 
orary) Scholarship freshman students, 
relatively homogeneous as to aptitude, 
past achievement, and socioeconomic 
status. Two forced-choice inventories, 
a semiprojective test, and three types 
of students’ grade average estimates 
were used. 

Due to the nature of the sample, the 
conclusions should be restricted chiefly 
to male engineering freshmen. The hy- 
pothesis that high achievers evidence 
greater need for achievement than do 
low achievers was supported by only 
one of four measures. High achievers 
show greater need for social love and 
affection, relative to recognition, than 
do low achievers. Generally, high 
achievers had a greater expectancy for 
academic success and higher minimal 
grade goals than did low achievers. 
These trends were reflected in the en- 
tire group of 72 subjects as well as in 
the comparisons between high and low 
achieving groups. The low positive in- 
tercorrelations among the four need 
for achievement measures suggest im- 
portant inadequacies of the concept 
and/or measures of achievement need. 


THE USE OF ADVANCE ORGANIZERS IN THE LEARNING
AND RETENTION OF MEANINGFUL VERBAL MATERIAL 


DAVID P. AUSUBEL 


Bureau of Educational Research, University of Illinois 


The purpose of this study is to test 
the hypothesis that the learning and re- 
tention of unfamiliar but meaningful 
verbal material can be facilitated by 
the advance introduction of relevant 
subsuming concepts (organizers). This 
hypothesis is based on the assumption
that cognitive structure is hierarchi- 
cally organized in terms of highly in- 
clusive concepts under which are sub- 
sumed less inclusive subconcepts and 
informational data (Ausubel, Robbins, 
& Blake, 1957). If this organizational 
principle of progressive differentiation 
of an internalized sphere of knowledge 
does in fact prevail, it is reasonable to 
suppose that new meaningful material 
becomes incorporated into cognitive 
structure in so far as it is subsumable 
under relevant existing concepts. It fol- 
lows, therefore, that the availability 
in cognitive structure of appropriate 
and stable subsumers should enhance 
the incorporability of such material. If 
it is also true that “meaningful for- 
getting” reflects a process of memorial 
reduction, in which the identity of new 
learning material is assimilated by the 
more inclusive meaning of its sub- 
sumers (Ausubel et al., 1957), the 
same availability should also enhance 
retention by decelerating the rate of 
obliterative subsumption. 

In the present study, appropriate 
and relevant subsuming concepts (or- 
ganizers) are deliberately introduced 
prior to the learning of unfamiliar aca- 
demic material, in order to ascertain 
whether learning and retention are en- 
hanced thereby in accordance with the 
theoretical premises advanced above. 

METHOD


Subjects 


The experimental population consisted of 
110 senior undergraduate students (78 women
and 32 men) in four sections of an educa- 
tional psychology course at the University 
of Illinois. All Ss were enrolled in one of 
eight teacher education curricula at the sec- 
ondary school level. Students specializing in 
industrial education and in vocational agri- 
culture were excluded from the study since 
they had received specific instruction in the 
topic covered by the learning passage. The 
experiment was conducted separately in each 
section as a required laboratory exercise and 
was performed during regularly scheduled 
class hours. In order to maximize ego-involve- 
ment, Ss were informed that after the data 
were processed their individual scores, as well 
as the class results, would be reported to 
them. 


Learning Passage and Test of Retention

The learning material used in this study 
was a specially prepared 2,500-word passage¹
dealing with the metallurgical properties of 
plain carbon steel. Emphasis was placed on 
such basic principles as the relationship be- 
tween metallic grain structure, on the one 
hand, and temperature, carbon content, and 
rate of cooling, on the other. Important fac- 
tual information (e.g., critical temperatures), 
however, was also included, and basic prin- 
ciples were also applied to such technological 
processes as heat treatment and tempering. 

The metallurgical topic was chosen on the 
basis of being generally unfamiliar to
undergraduates in liberal arts and sciences (i.e.,
not ordinarily included in chemistry courses),
but still sufficiently elementary to be both 
comprehensible and interesting to novices 
with no prior background in the field. The 
criterion of unfamiliarity was especially
crucial because the purpose of the study was to
ascertain whether advance organizers could
facilitate retention in areas of knowledge new
to learners. By using unfamiliar material it
was also possible to ensure that all Ss started
from approximately the same baseline in
learning the material. Empirical proof of
unfamiliarity was sought, therefore, by
administering the retention test on the steel passage
to a comparable group of naive Ss who had
not studied the material; but although this
latter group of Ss made scores which, on the
average, were only slightly and not
significantly better than chance, it was evident
from later analysis of the experimental data
that scores earned by Ss who had studied the
passage were related to both sex and field of
specialization. Male students and majors in
science and art were better able to learn and
retain the steel material than were female
students and majors in English, foreign
languages, music, and the social sciences. Hence,
the criterion of unfamiliarity was not
completely satisfied, inasmuch as these
differences undoubtedly reflected, in part,
variability in relevant incidental experience
influencing the learnability of the material.

¹ Appreciation is expressed to Robert M.
Tomlinson for assistance in the preparation
of the learning passage.

Knowledge of the steel passage was tested 
by a 36-item multiple-choice examination 
with a corrected split-half reliability of .79. 
Test questions covered principles, facts, and 
applications, and were selected by an item 
analysis procedure from a larger population 
of items. Scores on the test showed a satis- 
factory range of variability and were distrib- 
uted normally. Since it was intended as a 
power test, no time limit was imposed. 
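
The "corrected split-half reliability" of .79 reported above can be illustrated with a short sketch. The article does not name the correction that was applied; the Spearman-Brown step-up formula is the conventional one and is assumed here. The function names are hypothetical.

```python
def spearman_brown(r_half: float) -> float:
    """Step a half-test correlation up to a full-length reliability estimate.

    Assumption: the standard Spearman-Brown formula, 2r / (1 + r).
    """
    return 2.0 * r_half / (1.0 + r_half)


def implied_half_correlation(r_full: float) -> float:
    """Invert the correction: the half-test correlation implied by a corrected value."""
    return r_full / (2.0 - r_full)


# Reported corrected reliability of the 36-item steel test.
r_half = implied_half_correlation(0.79)
# Round trip: stepping the implied half-test value back up recovers .79.
r_full = spearman_brown(r_half)
print(round(r_half, 3), round(r_full, 2))
```

Under this assumption, the uncorrected correlation between the two halves of the test would have been somewhat lower than the reported figure, which is why the corrected value is the one conventionally quoted.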


Procedure 


It was first necessary to equate experimen- 
tal and control groups on the basis of ability 
to learn an unfamiliar scientific passage of 
comparable difficulty. The passage used for 
this purpose was concerned with the endo- 
crinology of human pubescence and was ap- 
proximately 1,800 words long. Ss were given 
20 minutes to read and study this material, 
and were tested immediately thereafter by 
a 26-item multiple-choice test with a cor- 
rected split-half reliability of .78. (The un- 
familiarity of the material had been pre- 
viously ascertained by administering the test 
to a comparable group of naive Ss who had 
not studied the passage, and obtaining a 
mean score only slightly and not significantly 
greater than chance.) Test scores on the pu- 
bescence passage were normally distributed 
and correlated .64 on a product-moment basis
with test scores on the steel passage. F tests 
were performed on the variance ratios of the
pubescence material test scores for all pos- 
sible combinations of the four sections, and 
none approached significance at the .05 level 
of confidence. It was considered justifiable, 
therefore, to treat the retention scores of ex- 
perimental and control groups on the steel 
passage as if derived, respectively, from one 
large class rather than from four separate 
sections. 
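
The check on intersectional homogeneity of variance can be sketched as follows. This is a minimal illustration with invented scores, not the study's data: it forms the variance ratio (larger over smaller) for every pair of the four sections, and each ratio would then be compared with the tabled critical F for its degrees of freedom. The section labels and the helper name are hypothetical.

```python
from itertools import combinations
from statistics import variance

# Hypothetical matching-test scores for four class sections (not the study's data).
sections = {
    "A": [12, 15, 17, 14, 19, 16, 13, 18],
    "B": [11, 16, 18, 15, 20, 14, 12, 17],
    "C": [13, 14, 16, 15, 18, 17, 12, 19],
    "D": [10, 15, 19, 14, 21, 16, 13, 18],
}


def variance_ratio(x, y):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    vx, vy = variance(x), variance(y)
    if vx >= vy:
        return vx / vy, len(x) - 1, len(y) - 1
    return vy / vx, len(y) - 1, len(x) - 1


# All possible pairs of the four sections: 6 comparisons.
pairwise_f = {
    (a, b): variance_ratio(sections[a], sections[b])
    for a, b in combinations(sections, 2)
}

for pair, (f, df_num, df_den) in pairwise_f.items():
    # Each F would be compared with the tabled critical value for (df_num, df_den).
    print(pair, round(f, 2), df_num, df_den)
```

Four sections yield six pairwise comparisons; if none exceeds its critical value, pooling the sections is defensible on the grounds given in the text.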

Ss in each of the four sections were 
matched on the basis of test scores on the 
pubescence material and assigned to experi- 
mental and control groups. Experimental and 
control treatments were then administered 
simultaneously to experimental and control 
Ss, respectively, within each section. This 
procedure was possible because the two treat- 
ments consisted of studying identical appear- 
ing introductory passages differing only in 
content. The use of this procedure also pro- 
vided the important methodological advan- 
tage of holding instructor, class, and situa- 
tional variables constant for both groups. 
Each introductory passage of approximately 
500 words was studied twice, 5 minutes each 
time, by the appropriate group of Ss. The two 
occasions were 48 hours and immediately be- 
fore exposure to the main learning passage. 

The experimental introductory passage
contained background material for the learn- 
ing passage which was presented at a much 
higher level of abstraction, generality, and 
inclusiveness than the latter passage itself. 
It was designed to serve as an organizing or 
anchoring focus for the steel material and to 
relate it to existing cognitive structure.
Principal emphasis was placed, therefore, on the
major similarities and differences between 
metals and alloys, their respective advan- 
tages and limitations, and the reasons for 
making and using alloys. Although this
passage provided Ss in the experimental group
with relevant background concepts of a
general nature, it was carefully designed not to
contain specific information that would
confer a direct advantage in answering any of
the questions on the steel test. This latter
criterion was tested empirically and shown to
be warranted when a comparable group of 
Ss made only a slightly better than chance 
mean score on the steel test after studying 
the introductory passage alone. 

The control introductory passage, on the 
other hand, consisted of such historically rel- 
evant background material as the historical 
evolution of the methods used in processing 
iron and steel. This type of introductory 
material is traditionally included in most 
textbooks on metallurgy and is presumably 
intended to enhance student interest. In
contrast to the introductory passage given to the
experimental group, it contained no
conceptual material that could serve as an
ideational framework for organizing the
particular substantive body of more detailed ideas,
facts, and relationships in the learning
passage.

It was methodologically necessary to pro- 
vide this control treatment in order that any 
obtained difference between experimental 
and control groups could be attributed to the 
particular nature of the experimental intro- 
ductory passage (i.e., to its organizing prop- 
erties) rather than to its presence per se. 

Both groups studied the steel passage for 
35 minutes and took the multiple-choice steel 
test 3 days later. Since it was evident from a 
comparison of test scores on the steel and 
pubescence passages that scores on the steel 
test were related to Ss’ sex and major field, 
it was necessary to hold these latter factors 
(as well as pubescence test scores) constant. 
Hence, it was no longer possible to use the 
originally matched pairs of Ss within each 
section. Sufficient Ss were also not available 
to rematch individual pairs of Ss on all three 
variables. By matching experimental and 
control Ss across sections, however, it was 
possible to equate two groups of 40 Ss each 
for sex, pubescence scores, and field of spe- 
cialization. The crossing of sectional lines in 
this rematching procedure was justifiable in 
view of the intersectional homogeneity of 
variance. 


RESULTS AND DISCUSSION 


The distribution of steel test scores 
for both experimental and control 
groups did not deviate significantly 
from the normal.² The mean steel test
score of the experimental group was 
16.7, as compared to 14.1 for the con- 
trol group and a mean chance score of 
7.2 (one-fifth of 36). The standard de- 
viations of the two groups were 5.8 and 
5.4, respectively. The difference
between the means³ of the experimental
and control groups was almost significant
at the .01 level for a one-tailed
test.

² Appreciation is expressed to Pearl Ausubel
for assistance in the processing of the
data.

³ The standard error of the difference for
equated groups was calculated according to
a method described by Edwards (1954, pp.
282-288).

TABLE I
RETENTION TEST SCORES OF EXPERIMENTAL
AND CONTROL GROUPS
ON LEARNING PASSAGE

Group          Type of Introduction    Mean    SD
Experimental   Substantive             16.7    5.8
Control        Historical              14.1    5.4

Note. Chance score on the multiple-choice test of
36 items is 7.2. The difference between the means in this
table is reliable at between the .05 and .01 level of
confidence.
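
The figures reported above can be checked with simple arithmetic. The sketch below reproduces the chance score and the raw difference between means, and adds a standardized difference based on a pooled standard deviation. The standardized difference is a descriptive assumption introduced here for illustration; it is not the equated-groups procedure from Edwards (1954) used in the article, and it yields no significance test.

```python
from math import sqrt

N_ITEMS = 36     # items on the steel test
N_OPTIONS = 5    # implied by the chance score being one-fifth of 36
chance_score = N_ITEMS / N_OPTIONS

mean_exp, sd_exp = 16.7, 5.8   # experimental group (substantive introduction)
mean_ctl, sd_ctl = 14.1, 5.4   # control group (historical introduction)

raw_difference = mean_exp - mean_ctl

# Descriptive only: pooled SD for two equal-sized groups (40 Ss each).
pooled_sd = sqrt((sd_exp ** 2 + sd_ctl ** 2) / 2)
standardized_difference = raw_difference / pooled_sd

print(chance_score, round(raw_difference, 1), round(standardized_difference, 2))
```

On these figures the experimental group exceeds the control group by 2.6 items, a little under half a pooled standard deviation, with both groups well above the chance score of 7.2.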

The obtained difference in retention 
between experimental and control 
groups, although statistically signifi- 
cant, would undoubtedly have been 
even greater if the learning passage 
used for matching purposes had been 
in the same subject matter field as the 
steel material (i.e., if the relationship 
between the two sets of scores were 
higher than that indicated by the cor- 
relation of .64 between the steel and 
pubescence scores). Another experi- 
mental condition probably detracting 
from the difference between the two 
groups was the fact that the steel ma- 
terial was not completely unfamiliar 
to many Ss. Because of some prior gen- 
eral familiarity with the contents of 
the steel passage, many Ss already pos- 
sessed relevant and stable subsuming 
concepts. These obviously rendered less 
significant the potential learning ad- 
vantages conferable by advance or- 
ganizers. 

It could be argued, of course, that 
exposure to the experimental introduc- 
tion constituted in effect a partial sub- 
stantive equivalent of an additional 
learning trial. Actually, however, any 
substantive repetition was at most very 
indirect, since the introductory passage 
consisted of much more inclusive and 
general background material than was 
contained in the learning task itself, 
and also provided no direct advantage
in answering the test items. Further- 
more, according to behavioristic (in- 
terference) theory, prior exposure to 
similar but not identical learning ma- 
terial induces proactive inhibition 
rather than facilitation. 

Advance organizers probably facili- 
tate the incorporability and longevity 
of meaningful verbal material in two 
different ways. First, they explicitly
draw upon and mobilize whatever rele- 
vant subsuming concepts are already 
established in the learner’s cognitive 
structure and make them part of the 
subsuming entity. Thus, not only is the 
new material rendered more familiar 
and meaningful, but the most relevant 
ideational antecedents are also selected 
and utilized in integrated fashion.
Second, advance organizers at an appropriate
level of inclusiveness provide
optimal anchorage. This promotes both 
initial incorporation and later resist- 
ance to obliterative subsumption. 

The appropriate level of inclusive- 
ness may be defined as that level which 
is as proximate as possible to the de- 
gree of conceptualization of the learn- 
ing task—relative, of course, to the 
existing degree of differentiation of 
the subject as a whole in the learner’s 
cognitive background. Thus, the more 
unfamiliar the learning material (i.e., 
the more undifferentiated the learner’s 
background of relevant concepts), the 
more inclusive or highly generalized 
the subsumers must be in order to be 
proximate. If appropriately relevant 
and proximate subsuming concepts are 
not available, the learner tends to use 
the most proximate and relevant ones 
that are. But since it is highly improbable
that we can count on the
spontaneous availability of the most 
relevant and proximate subsuming 
concepts, the most dependable way of 
facilitating retention is to introduce
the appropriate subsumers and make
them part of cognitive structure prior 
to the actual presentation of the learn- 
ing task. The introduced subsumers 
thus become advance organizers or 
anchoring foci for the reception of new 
material. 

Even though this principle seems 
rather self-evident, it is rarely followed
in actual teaching procedures or in the 
organization of most textbooks. The 
more typical practice is to segregate 
topically homogeneous materials into 
separate chapters, and to present them 
throughout at a uniform level of con- 
ceptualization in accordance with a 
logical outline of subject matter organization.
This practice, of course,
although logically sound, is psychologically
incongruous with the postulated
process whereby meaningful learning
occurs, i.e., with the hierarchical
organization of cognitive structure in
terms of progressive gradations of
inclusiveness, and with the mechanism of
accretion through a process of progressive
differentiation of an undifferentiated
field. Thus, in most instances,
students are required to learn the de- 
tails of new and unfamiliar disciplines 
before they have acquired an adequate 
body of relevant subsumers at an ap- 
propriate level of inclusiveness. 

As a result, both students and teach- 
ers are often coerced into treating 
meaningful materials as if they were 
rote in character, and students conse- 
quently experience unnecessary diffi- 
culty and reduced success in both 
learning and retention. The teaching 
of mathematics and science, for exam- 
ple, still relies heavily on rote learning 
of formulas and procedural steps, on 
recognition of stereotyped “type problems,”
and on mechanical manipulation
of symbols. In the absence of clear
and stable concepts which can serve as 
anchoring points and organizing foci 
for the incorporation of new meaningful
material, students are trapped in a
morass of confusion and have little 
choice but to rotely memorize learning 
tasks for examination purposes. The 
traditional historical introduction to 
new and primarily nonhistorical subject
matter concepts possibly enhances
student interest, but lacks the neces- 
sary substantive content to serve this 
organizing function (see examples un- 
der Procedure section above). 

The pedagogic value of advance or- 
ganizers obviously depends in part 
upon how well organized the learning 
material itself is. If it contains built-in 
organizers and proceeds from regions 
of lesser to greater differentiation 
(higher to lower inclusiveness), rather
than in the manner of the typical text- 
book or lecture presentation, much of 
the potential benefit derivable from ad- 
vance organizers will not be actualized. 
Regardless of how well-organized 
learning material is, however, it is
hypothesized that learning and retention
can still be facilitated by the use of 
advance organizers at an appropriate 
level of inclusiveness. Such organizers 
are available from the very beginning 
of the learning task, and their integra- 
tive properties are also much more sa- 
lient than when introduced concur- 
rently with the learning material. 


SUMMARY AND CONCLUSIONS


An empirical test was made of the 
hypothesis that the learning and re- 
tention of unfamiliar but meaningful 
verbal material could be facilitated by 
the advance introduction of relevant 
subsuming concepts (organizers). Ex- 
perimental and control groups of 40 
undergraduate Ss each were equated 
on the basis of sex, field of specialization,
and ability to learn unfamiliar
scientific material. The learning task 
consisted of a 2,500-word passage of 
empirically demonstrated unfamiliarity,
dealing with the metallurgical
properties of steel. On two separate oc- 
casions, 48 hours and immediately 
prior to contact with the learning task, 
experimental Ss studied a 500-word in- 
troductory passage containing sub- 
stantive background material of a con- 
ceptual nature presented at a much 
higher level of generality, abstraction, 
and inclusiveness than the steel mate- 
rial itself. This passage was empirically 
shown to contain no information that 
could be directly helpful in answering 
the test items on the steel passage. Con- 
trol Ss similarly studied a traditional 
type of historical introduction of iden- 
tical length. Retention of the learning 
material was tested 3 days later by 
means of a multiple-choice test. Com- 
parison of the mean retention scores of 
the experimental and control groups 
unequivocally supported the hypothesis.

The facilitating influence of advance 
organizers on the incorporability and 
longevity of meaningful learning ma- 
terial was attributed to two factors: 
(a) the selective mobilization of the 
most relevant existing concepts in the 
learner’s cognitive structure for inte- 
grative use as part of the subsuming 
focus for the new learning task, thereby 
increasing the task’s familiarity and 
meaningfulness; and (b) the provision 
of optimal anchorage for the learning 
material in the form of relevant and 
appropriate subsuming concepts at a 
proximate level of inclusiveness. 

The suggestion was offered that the 
greater use of appropriate (substantive 
rather than historical) advance or- 
ganizers in the teaching of meaningful 
verbal material could lead to more ef- 
fective retention. This procedure would 
also render unnecessary much of the 
rote memorization to which students 
resort because they are required to 
learn the details of a discipline before
having available a sufficient number of 
key subsuming concepts. 


REFERENCES 


AUSUBEL, D. P., ROBBINS, LILLIAN C., &
BLAKE, E., JR. Retroactive inhibition and
facilitation in the learning of school
materials. J. educ. Psychol., 1957, 48, 334-343.

EDWARDS, A. L. Statistical methods for the
behavioral sciences. New York: Rinehart,
1954.


(Received March 14, 1960)



Journal of Educational Psychology
1960, Vol. 51, No. 5, 273-276


STABILITY AND CORRELATES OF JUDGED CREATIVITY 
IN FIFTH GRADE WRITINGS 


NORMAN E. WALLEN AND GILBERT M. STEVENSON
University of Utah 


Proceedings of conferences on crea- 
tivity (Taylor, 1959) generate the hy- 
pothesis that “creativity” may soon be 
added to “Achievement, Anxiety, and 
Authoritarianism” as the favorite var- 
iables of psychologists. As has been 
frequently pointed out, however, this 
variable poses rather severe problems 
of definition and measurement. 

This paper is concerned only with 
creativity in writing as judged by fifth 
grade teachers. If creative abilities can 
be identified at an early age, one ap- 
proach to measuring such abilities is 
to have sample writings evaluated by 
persons with wide experience with such 
material. The first question which 
arises is whether teachers can agree as 
to the creativity of a given sample of 
writing. Studies by Wrightstone (1938) 
and Mary Francis Assisi (1950) indi- 
cate that such agreement can be ob- 
tained. In the event that judges can 
agree, the next question is the extent 
to which such ability as judged is con- 
stant over a period of time. In addition 
to investigating these two questions, 
we were interested in the relation of 
this variable to the dimensions of intel- 
ligence, academic achievement in the 
more traditional sense, and social ad- 
justment. 


PROCEDURE AND RESULTS 


Five teachers in a small town elementary
school comprised the group
of judges.¹ The student sample (N =
63) consisted of the two fifth grades in
the same school. There would appear to
be no atypical characteristics of the
students. The mean and standard deviation
on the California Mental Maturity
Test are 103 and 13.7, respectively.
Neither do the five judges appear
in any major respect atypical of
elementary school teachers with the
possible exception of two who were the
teachers of the two classes comprising
the student sample. That they may be
somewhat atypical is suggested by the
fact that both teachers place considerable
emphasis on establishing what is
considered to be a creative atmosphere
as described by Wilson (1958):


Writers on the subject are in pretty general
agreement that the environmental conditions
which foster creativity are those which
encourage independent thought and which
are permissive of new ideas. They seem to be
in agreement that conditions which produce
fear of criticism are likely to inhibit the
individual’s expression of his creative ideas (p.
117).


¹ Originally, a sixth judge who was an upper
division undergraduate student in education
was included. The agreement between
this judge and the others remained low, however,
and hence his ratings were excluded.
Several explanations of his discrepant judgments
are possible; we favor the notion that
his lack of experience with children’s writings
is primarily responsible.


During the year, the students had
considerable experience in writing with
emphasis on the expression of original
ideas and feelings. Class discussion of
their compositions was common.

Four sets of compositions each on a
different topic were utilized for purposes
of the study. All were obtained
during the spring term at 2- to 4-week
intervals. The first set was used as a
preliminary test of judge agreement
and was written on one of the first days
of spring, the instructions to students
being to express their feelings about the
arrival of spring. The second set of
compositions consisted of inscriptions 
for Mothers’ Day cards to be taken 
home. The third set was structured as 
reflecting impressions on a rainy day 
and the final set was intended to ex- 
press feelings as to the end of the school 
year. No time limit was set but most 
papers were finished within 45 minutes, 
though a few took as long as 2 hours. 
The typical length of the compositions 
was 80 to 100 words. 

Each judge first assigned a rating to 
each paper in the first set according to 
the following definition and instruc- 
tions: 

Creativity is the expression of informa- 
tion, ideas and feelings colored by original 
thoughts and inspired by the inner urge of 
an individual to express himself. With this 
definition in mind, judge these examples on 
how well the individual in a creative way ex- 
presses his ideas and feelings based on the 
title and theme of the writing. Judge on a 
five-point system—five being the most crea- 
tive and one the least. 


We recognize that this definition is 
inelegant and imprecise, but it does 
seem to convey the elements of crea- 
tivity as expressed by many writers. 
Identification was, of course, removed 
from the papers. 

The intercorrelations of judges by 
pairs initially ranged from .57 to .68. 
At this point a discussion was held with 
the judges, at which time individual 
compositions were examined and points 
of disagreement analyzed. The princi- 
pal decisions reached were that spell- 
ing, neatness, vocabulary, and length 
should not be weighted in judging nor 
should poetry be considered more crea- 
tive than prose. Subsequently the same 
compositions were judged again with 
resulting interjudge correlations in the 
high .70’s and .80’s. It was decided that 
sufficient agreement had been attained 
to warrant proceeding further. This 
first set of compositions was set aside 
and played no further part in the study.

The next step was to have each judge 
rate the other three sets of composi- 
tions. From these ratings, a score for 
each set was obtained for each student 
which was the mean of five ratings re- 
ceived. It was then possible to deter- 
mine the extent to which this composite 
rating was consistent over the three 
sets of writings. The correlations be- 
tween Sets 1 and 2, between Sets 1 and 
3, and between Sets 2 and 3 were .81, 
.86, and .86, respectively, demonstrat- 
ing a high degree of stability of crea- 
tivity as measured by the composite 
score. 
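
The stability analysis just described can be sketched in two steps: average the five judges' ratings for each student on each set, then correlate the resulting composite scores between sets. The ratings below are invented for six students and two sets purely for illustration; the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def composite(ratings_by_judge):
    """Mean of the judges' ratings for each student on one set of compositions."""
    n_judges = len(ratings_by_judge)
    return [sum(column) / n_judges for column in zip(*ratings_by_judge)]


# Hypothetical ratings: 5 judges (rows) x 6 students (columns), five-point scale.
set_1 = [[5, 4, 2, 3, 1, 4], [4, 4, 2, 3, 2, 5], [5, 3, 1, 3, 1, 4],
         [4, 4, 2, 2, 1, 4], [5, 4, 3, 3, 2, 4]]
set_2 = [[4, 4, 2, 3, 2, 5], [5, 3, 2, 3, 1, 4], [4, 4, 1, 2, 1, 4],
         [5, 4, 2, 3, 2, 5], [4, 3, 2, 3, 1, 4]]

# Stability of the composite rating from one set of writings to another.
stability = pearson(composite(set_1), composite(set_2))
print(round(stability, 2))
```

With three sets, the same computation is repeated for each of the three pairs of sets, giving the three coefficients reported in the text.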

Since we now had evidence that 
there was consistency in compositions 
from set to set, we proceeded to derive 
a second score by taking the mean of 
the three ratings given by each judge 
to each student. This measure was 
taken as the measure of creativity 
demonstrated by the student, as seen 
by a given judge. In effect, this pro- 
vided a measure based on three writ- 
ings rather than one as had been the 
case with our original data on judge 
agreement. To determine the agree- 
ment of judges based on this score, we 
computed the intercorrelations among 
pairs of judges. The 10 correlations 
ranged from .70 to .93 with a mean of 
.81. Although there is a spread in
correlations, we feel that considerable
agreement is indicated. 
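
Five judges give 10 distinct pairs, which accounts for the 10 correlations mentioned above. The sketch below, again with invented scores, computes the pairwise interjudge correlations on each judge's three-set mean and then the single composite per student. The judge labels and values are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: each value is one student's mean rating over the three
# sets, as given by one judge (6 students shown; the study had 63).
judge_scores = {
    "J1": [4.7, 3.7, 2.0, 3.0, 1.3, 4.3],
    "J2": [4.3, 3.3, 2.0, 3.0, 1.7, 4.7],
    "J3": [4.7, 3.7, 1.0, 2.7, 1.0, 4.0],
    "J4": [4.3, 4.0, 2.0, 2.3, 1.3, 4.3],
    "J5": [4.7, 3.7, 2.7, 3.0, 1.7, 4.0],
}

# One correlation per pair of judges: C(5, 2) = 10.
pair_r = {
    (a, b): pearson(judge_scores[a], judge_scores[b])
    for a, b in combinations(judge_scores, 2)
}
mean_r = sum(pair_r.values()) / len(pair_r)

# Single creativity score per student: the mean across the five judges.
creativity = [sum(col) / len(judge_scores) for col in zip(*judge_scores.values())]
print(len(pair_r), round(mean_r, 2))
```

Because every judge rated every student on the same three sets, the mean of the five per-judge means equals the mean of all 15 ratings, which is the composite the authors go on to use.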

The foregoing analyses appeared to 
warrant the computation of a single 
creativity score for each student which 
was the mean of all 15 ratings received 
by him, i.e., three ratings from each of 
five judges. This measure provides our 
operational definition of creativity in 
writing and was related to each of the 
following measures, all obtained dur- 
ing the spring term: IQ based on the 
California Mental Maturity Test, 
Short Form; Grade-equivalent scores 
on the Reading, Arithmetic, and Language
subtests of the California
Achievement Tests, Intermediate level;
the mean of teacher grades given in 
reading, writing, spelling, arithmetic, 
social studies, and science during the 
year; a social adjustment score con- 
sisting of the number of responses to a 
28-item incomplete sentences test 
which were judged to indicate a prob- 
lem; the number of “large problem” 
responses to the SRA Junior Inventory 
—a measure of social adjustment; a 
rating by the child’s teacher on a five-point
scale of social adjustment, on which a
high score indicated superior adjustment
as viewed by the teacher; and the
mean rating received on the Ohio So- 
cial Acceptance Scale—a high score in- 
dicated that the rest of the class tended 
to rank the student as well-liked. 

The correlations between the com- 
posite creativity measure and each of 
the other measures are shown in Table 
1 along with descriptive data on each 
measure. The data seem quite consist- 
ent and present a coherent picture. In 
short, there is a substantial relationship 
between the creativity score and each 
of the ability measures (the values 
ranging from .57 to .72) and a smaller
but significant relationship with social 
adjustment indicating a tendency for 
the more creative to be better adjusted 
socially.²


DISCUSSION


One issue we must deal with is the 
possibility of contamination in the 
data. To what extent may a given rat- 
ing reflect knowledge of the identity of 
the writer? At most, one of the five 
judges would have had the possibility 
of identifying the writer through hand- 
writing, phraseology, etc., since the two 
class teachers served as judges. We feel 
this possibility is slight, though possible,
and our data on judge agreement
provide contrary evidence since there
is good interjudge agreement. The
question arises again regarding the sta-
bility data. Is it probable that consist-
ency in rating reflects recall of the
rating previously assigned to the same
handwriting, similar phraseology, etc.?
Although we cannot rule out this pos-
sibility, it seems unlikely that judges
would be able to recall discriminations
of this magnitude while dealing with
63 different students and selections of
around 100 words.


*Note that the negative correlations
merely reflect the fact that on these measures
a high score reflects poorer adjustment.


TABLE 1


DATA ON ABILITY AND ADJUSTMENT MEASURES
AND THEIR RELATION TO
CREATIVITY SCORES


                                    Standard
Variable                   Mean     Deviation    Correlation

Sentence Comp. Test         5.02       2.31         -.31
SRA Junior Inventory       17.74      11.39         -.23
Ohio Rating Scale           2.76        .61          3
Social Adj. Teacher         3.03        .87          .45
Cal. Ach. Reading           5.04       1.22          .71
Cal. Ach. Arithmetic        5.40        .66          .66
Cal. Ach. Language          5.65        .89          .72
Sch. Grade Average          3.10        .61          .66
IQ Cal. Men. Mat.         103.58      13.56          .57

* Values required for significance at the .05 and .01
levels are .25 and .33, respectively.

As to the relationships between crea- 
tivity and the other dimensions, we 
may say that they support the notion 
that creativity in writing does not exist 
in a vacuum but is rather highly re- 
lated to general intellectual and aca- 
demic skills. It is rather surprising to 
find the measure of general ability cor- 
relating to a lesser degree with creativ- 
ity score than the measures of specific 
academic skills, though the difference is 
not statistically significant. It may be 
that creativity in writing is more heav-
ily dependent on such specific skills
than many have thought. An alternative
explanation is that our judges were
in fact rating “scholastic conformity” 
rather than creativity though we do not 
think this to be the case. 

The relationships between creativity 
and our measures of social adjustment 
support the notion that creativity can 
flourish in the child who gets along 
well with others. They offer little sup- 
port for the notion that creativity is 
related to social ostracism, frustration, 
etc. In looking at our data clinically, 
we found a tendency for there to be 
more variability on the measures of so- 
cial adjustment at the low end of the 
creativity continuum, the nine chil- 
dren having the highest creativity 
scores being rated high in social ad- 
justment by teachers and reporting few 
problems on the SRA inventory, with- 
out exception. Of the nine, one received 
a low score on the Ohio rating scale and 
a high score on the sentence completion 
test suggesting some difficulties in this 
area, and another student scored in the 
middle of the distribution on these 
measures. The other seven all scored 
high in the “favorable” direction. 


SUMMARY 


After preliminary training, five elementary
school teachers were able to
achieve an acceptable level of agreement
as to the amount of creativity
demonstrated in the compositions of 
each of 63 fifth grade students. Com- 
posite scores based on all five ratings 
of each composition were shown to be 
very stable over three sets of writings 
obtained at 2- and 4-week intervals. 
An overall score for each child based 
on a total of 15 ratings was found to be 
quite highly related to measures of gen- 
eral intelligence and academic profi- 
ciency, and moderately related to 
measures of social adjustment.


THE CONTRIBUTION OF THE LECTURE TO
COLLEGE TEACHING


Faced with the prospect of a growing
number of students, educators have 
become increasingly concerned with 
the manner in which the instructor 
can most effectively use the time which
he devotes to his students. Considerable
research has been directed toward
finding the best method for an instructor
to use. However, the conclusions
reached are, at best, tentative.

A general paradigm followed by 
researchers in studying teaching meth- 
ods has been one in which results of 
teaching by lecturers for a certain 
number of hours per week are used as 
a criterion. Results obtained with some 
other method, e.g., discussion, are 
compared to the usual lecture proce- 
dure. As a result of these studies the 
lecture method has been both condemned
and supported. For reviews of the research
on teaching methods see Birney
and McKeachie (1955), Dashiell
(1935), Wispe (1953), and Wolfe
(1942).


Greene (1928) and Corey (1934) 
used simple experimental designs which 
were restricted to a comparison of the 
effectiveness of one or two lectures 
with a similar number of reading 
periods. Their results were not gen- 
eralized to the effects of lectures for 
an entire course of instruction. Parsons, 
Ketcham, and Beech (1958) compared 
one lecture per week with a condition
in which small groups of students met
to discuss the assigned readings once
a week. The latter study was more
concerned with the factor of student-
instructor contact than with the
influence of the lecture. These authors
compared lecture and reading with 
discussion and reading. 

The assumption underlying virtually 
all of the research in this area is that 
something is being taught by the lec- 
ture method that the student cannot 
get for himself by reading the textbook 
for the course. The present study was 
planned as a test of this assumption. 
The value of the lecture was assessed 
by comparing it to a condition in which 
students received no lectures. 


METHOD 


General 


The specific question of this study was: 
Do students obtain as much knowledge 
from the text (Morgan, 1956) as they do 
from lectures and text? The best research 
design would seem to be one in which one 
group reads the text and attends lectures, 
while the other group does not attend class 
but only reads the text assignments. This 
“best” methodology would present two
problems. 

1. Any difference found between the
groups might result from not doing the
reading assignments because one is not
coming to class.

2. Preselection by the students might
occur if they knew that two different teach-
ing methods were to be used.

To resolve the first of these difficulties,
the general procedure employed in conduct-
ing this study was to teach an introductory
course in psychology in two different ways.
Four sections, comprising the control group,
were taught in the conventional manner,
with a formal lecture presented at each class
meeting (four per week), exclusive of review
and examination periods. Four additional
sections, comprising the experimental 
group, received no formal lectures, and met 
only once a week, at which time the instruc- 
tor’s task was limited to answering specific 
questions raised by the students concerning 
the reading assignments. No discussion was 
permitted. Thus, at least partial control of 
the students’ reading was maintained and 
students did have contact with the instruc- 
tor. 

To resolve the second difficulty, students 
registered for the various sections of the 
course in the conventional registration pro- 
cedures common to most colleges and uni- 
versities. None of the students was aware 
that certain sections of the course would be 
taught in a unique manner until registration
was completed and classes had met for the
second time.


Instructors 


Four Graduate Teaching Instructors 
participated in the study. Each assumed 
full responsibility for two sections of the 
general introductory course in Psychology, 
a one term, four credit course normally 
meeting four times weekly. All instructors 
had taught this course the preceding term. 

Each instructor taught one control sec- 
tion and one experimental section. To con- 
trol for possible influences of time of day 
taught, control and experimental sections 
were balanced between morning and after- 
noon. 


Subjects‘ 


The subjects used in this study were 144 
male and female undergraduate students, 
ranging from sophomores to seniors, en- 
rolled in introductory psychology during 
the winter term of the academic year 1958- 
1959. The mean number of students per class 
was 38 and the range of students per class 
was 28 to 44. From each class 18 students 
were selected as subjects so that 6 had a 
grade point average above 2.5, 6 were in the 
grade point average range from 2.49 to 2.15, 
and 6 had a grade point average of 2.14 or 
below (C = 2.0). Selection of subjects within 
any level was random. 
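
The selection rule just described (six subjects drawn at random from each of three grade point levels in every class) can be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: the roster is hypothetical, the function names are invented here, and a grade point average of exactly 2.5 is placed in the upper level because the source leaves that boundary unspecified.

```python
import random


def gpa_level(gpa):
    """Assign a grade point average to one of the three levels in the text."""
    if gpa >= 2.5:       # source says "above 2.5"; the 2.5 boundary is an assumption
        return "high"
    if gpa >= 2.15:      # 2.15 through 2.49
        return "middle"
    return "low"         # 2.14 or below


def select_subjects(class_roster, per_level=6, seed=0):
    """Randomly draw `per_level` students from each GPA level of one class."""
    rng = random.Random(seed)
    by_level = {"high": [], "middle": [], "low": []}
    for student_id, gpa in class_roster:
        by_level[gpa_level(gpa)].append(student_id)
    selected = {}
    for level, ids in by_level.items():
        if len(ids) < per_level:
            raise ValueError(f"only {len(ids)} students at level {level!r}")
        selected[level] = rng.sample(ids, per_level)
    return selected


# Hypothetical class of 38 students with evenly spaced GPAs from 1.60 to 3.45.
roster = [(f"s{i:02d}", round(1.60 + 0.05 * i, 2)) for i in range(38)]

subjects = select_subjects(roster)
for level, ids in subjects.items():
    print(level, sorted(ids))
```

Applied to each of the eight classes, this rule gives 18 subjects per class and 144 in all, which matches the totals reported in the text.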





‘ The term “subject” is used to designate
those whose scores on examinations were
used in the analyses of variance. The term
“student” is used to refer to those who were
subjects and all others who received instruc-
tion during the period of the study.


Procedure


At the first meeting of each section a 
course outline, common to all sections, was 
distributed to each student. The outline in- 
cluded weekly textbook assignments and 
dates of examinations. The remainder of the 
first meeting was used for completion of 
various routine administrative tasks. 

During the second meeting of each class 
the instructors made verbal statements 
emphasizing the following points: that the 
Psychology Department was interested in 
trying new methods for teaching the course, 
that the class would meet (a) four times per 
week or (b) once a week for a question-an- 
swer period as appropriate; that three one- 
hour examinations would be scheduled during
the term; and that final grades would be
assigned on the basis of performance in the 
particular class without reference to classes 
which were being taught in different ways. 
It was felt that giving almost identical in- 
structions to both groups would decrease 
any effect on the experimental treatments 
due to change in the course once the stu- 
dents had registered. 

At no point during the term did the in- 
structors state or intimate that an experi- 
ment, as such, was being conducted. Every 
effort was made to avoid the possibility of 
the students’ considering themselves as 
“guinea pigs.” Office hours were held weekly
by each instructor for any student seeking 
assistance with the course material. The 
order and rate of material covered was the 
same for both the experimental and control 
sections. No attempt was made to give the 
same lectures in the same style to the var- 
ious control sections. Each instructor con- 
ducted his lecture classes in his own way. 


Criteria 

The primary criteria used in comparing 
the control and experimental sections were 
a series of four objective examinations. 
Three examinations were given during the 
term, and consisted in each case of 50, five- 
alternative, multiple-choice questions. 
Each examination covered five or six chap- 
ters of the textbook (about 150 pages). To 
prepare the examinations, each instructor 
submitted anonymously five questions rela- 
tive to each chapter assigned for the test. 
All questions were collected together by 
chapters and ranked in order of desirability 
by each instructor. Those questions receiv-
ing the highest overall preference were in-
cluded in the examination after allowance
had been made for duplication and adequacy
of coverage.

The final examination was a departmen- 
tal, 99-item, five-alternative, multiple- 
choice test administered to all sections of the 
course including those sections not in the 
study. 

The secondary criterion used in the study 
was an evaluation form completed anony- 
mously by all students at the end of the 
term. The form consisted of 41 true-false 
statements about the course and the in- 
structor. 


RESULTS 


An analysis of variance design was 
used to test for differences between 
the two methods. Criterion measures
used in the analysis were the subjects’ 
test scores. A separate analysis was 
performed for each of the three one- 
hour examinations and the final ex- 
amination. A chi square analysis was 
used to test for differences on the 
course rating inventory. 

The analysis of variance design used 
was taken from Snedecor (1956, 
p. 364). This design deals with two 
fixed effects and one random effect. 
The instructor was considered a ran- 
dom effect. Treatments (lecture and 
textbook, and textbook) and levels 
were considered fixed effects. 

Two problems were found which 
were important in the analysis of the 
data. First, the size of the classes used 
in this study varied, N = 28 to N = 
44. Second, the classes were distributed 
differently with respect to grade point 
average. Both of these problems were 
dealt with by incorporation of levels 
into the experimental design. 

The levels were selected so that
roughly one-third of all the students 
were in the upper level, one-third were 
in the middle level, and one-third were 
in the lower level. In two classes only 
six students were in a particular level 
(one high level and one low level). 
Therefore, a sample of six subjects
was randomly drawn from each class
at each level, giving an N of 18 for
each class and a total of 144. Further
investigation showed that different
selections of levels did not increase the
total N. The identity of the specific
subjects who were used in the final
analysis was unknown to the instruc-
tors until final grades had been as-
signed.


TABLE 1


ANALYSIS OF THE GRADE POINT AVERAGES
OF SUBJECTS


Source          df       MS         F

Treatment        1       .15       1.67
Levels           2     14.78     295.60**
Instructors      3       .07       1.40
T X L            2       .06        .86
T X I            3       .09       1.80
I X L            6       .04        .80
T X L X I        6       .07       1.40
Error          120       .05

Total          143

** p < .01

The first analysis was done using 
grade point averages of the students 
as values (see Table 1). This was done 
to determine if the subjects were ran- 
domly distributed with respect to their 
grade point averages among the in- 
structors and the treatments. None of 
the effects or the interactions was
found to be significant at p < .10
except the levels effect, which was
built into the design. The levels effect
was significant at p < .01.

Table 2 presents the results of the 
analyses of variance performed on the 
three one-hour examinations. None of 
the F values, except those reported 
in the table, was significant beyond 
the .05 level. The treatment effect was
beyond the .05 level in both the first 
test analysis and the third test analysis, 
but was not significant at the .05 level 
for the second test. However, the F 
value for the treatment effect in the 
second test was in the direction of the 
results obtained for Tests 1 and 3.


TABLE 2


ANALYSIS OF RESULTS FOR THE ONE-HOUR TESTS


                         Test 1               Test 2               Test 3
Source         df      MS        F          MS        F          MS        F

Treatment       1    506.25    13.86*     200.69     3.86      510.01    20.66**
Levels          2    715.88    26.49**    363.17    12.73**    488.67    13.33*
Instructors     3     28.82                57.19                30.90
T X L           2       .44                46.72                54.88
T X I           3     36.53                51.94                22.54
I X L           6     27.02                17.15                36.65
T X L X I       6     58.14                12.09                10.56
Error         120     22.10                28.52                24.68

Total         143

* p < .05.
** p < .01.


TABLE 3


MEAN PERFORMANCE OF THE LECTURE AND
THE NONLECTURE GROUPS


             Lecture group      Nonlecture group
               M        SD         M        SD         SD      SD_M

Test 1       34.61     5.57      30.86     5.83       4.53     .38
Test 2       37.69     5.36      35.33     6.17       5.34     .45
Test 3       38.35     5.39      34.58     5.85       4.97     .41
Final        74.32    10.92      69.78     7.98       7.79     .65


TABLE 4


ANALYSIS OF RESULTS FOR THE
FINAL EXAMINATION


Source          df        MS          F

Treatment        1      742.56      12.23**
Levels           2     1195.30       7.00*
Instructors      3      215.73       3.55*
T X L            2       17.27
T X I            3       28.10
I X L            6      170.72       2.81*
T X L X I        6       36.06
Error          120       60.74

Total          143

* p < .05.
** p < .01.


The means, standard deviations, and
standard errors of the mean for each
of the three one-hour examinations
and the final examination are reported
in Table 3. An idea of the practical
importance of the difference in student
performance which is attributable to
the different methods of instruction
can be obtained by noting that in all
tests the mean difference is between
.4 and .8 of the standard deviation.
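
The arithmetic behind this statement can be checked directly from the Table 3 figures. The sketch below is illustrative only. It assumes that the fifth numeric column of Table 3 is the standard deviation used as the yardstick and that the last column is that value divided by the square root of the 144 subjects; neither point is stated in the source.

```python
from math import sqrt

N_SUBJECTS = 144  # total subjects in the analyses of variance

# Values transcribed from Table 3:
# (lecture mean, nonlecture mean, SD column, standard error column)
TABLE_3 = {
    "Test 1": (34.61, 30.86, 4.53, 0.38),
    "Test 2": (37.69, 35.33, 5.34, 0.45),
    "Test 3": (38.35, 34.58, 4.97, 0.41),
    "Final": (74.32, 69.78, 7.79, 0.65),
}


def standardized_difference(lecture_mean, nonlecture_mean, sd):
    """Mean difference expressed in standard deviation units."""
    return (lecture_mean - nonlecture_mean) / sd


for name, (lecture, nonlecture, sd, reported_se) in TABLE_3.items():
    raw = lecture - nonlecture
    in_sd_units = standardized_difference(lecture, nonlecture, sd)
    computed_se = sd / sqrt(N_SUBJECTS)
    print(
        f"{name}: difference = {raw:.2f} questions, "
        f"{in_sd_units:.2f} SD, SD/sqrt(N) = {computed_se:.2f} "
        f"(reported {reported_se:.2f})"
    )
```

Under these assumptions every ratio falls in the range quoted in the text, and the last column is reproduced to within rounding.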

Analysis of the final examination 
(see Table 4) indicated that levels were 
significant (p < .05). The treatment 
effect was again highly significant (p 
< .01). New findings were that the 
instructor effect and the instructor X 
levels interaction effect were significant 
(both p < .025). 
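
As a check on Table 4, each reported F can be recomputed as a ratio of two mean squares. The sketch below is a reader's reconstruction, not the authors' computation. The choice of denominator for each effect is inferred purely from which ratio reproduces the printed F; the source cites Snedecor (1956) for the design but does not list the error terms.

```python
# Mean squares transcribed from Table 4 (final examination).
MEAN_SQUARES = {
    "Treatment": 742.56,
    "Levels": 1195.30,
    "Instructors": 215.73,
    "T X L": 17.27,
    "T X I": 28.10,
    "I X L": 170.72,
    "T X L X I": 36.06,
    "Error": 60.74,
}

# Denominators inferred from the arithmetic (an assumption, see lead-in).
DENOMINATOR = {
    "Treatment": "Error",
    "Levels": "I X L",
    "Instructors": "Error",
    "I X L": "Error",
}

# F values as printed in Table 4.
REPORTED_F = {
    "Treatment": 12.23,
    "Levels": 7.00,
    "Instructors": 3.55,
    "I X L": 2.81,
}


def f_ratio(effect):
    """F = mean square for the effect / mean square of its error term."""
    return MEAN_SQUARES[effect] / MEAN_SQUARES[DENOMINATOR[effect]]


for effect, reported in REPORTED_F.items():
    print(f"{effect}: computed F = {f_ratio(effect):.2f}, reported F = {reported:.2f}")
```

Testing Levels against the I X L mean square is what one would expect when instructors are treated as a random effect, which agrees with the design described in the text.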

To test for differences in responses 
to items on the course rating inventory, 
chi squares were used. Since the forms 
were completed anonymously, the sub- 
jects used in the analyses of variance 
could not be identified. Therefore, all 
students were included in this analy- 
sis, N = 122 for each group. The items 
on which the two groups differed at 
p < .05 are listed in Table 5. 
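
The chi square comparison for a single true-false item can be sketched as follows. This is an illustration under assumptions, not the authors' calculation: the counts are reconstructed from the rounded percentages in Table 5 with N = 122 per group, no continuity correction is applied (the source does not say whether one was used), and the function names are invented here. The critical value 3.841 is the standard .05 point of chi square with one degree of freedom.

```python
CRITICAL_CHI2_DF1_05 = 3.841  # chi square, 1 df, p = .05


def counts_from_percent(percent_true, n):
    """Reconstruct (true, false) counts from a rounded percentage."""
    true_count = round(percent_true * n / 100)
    return true_count, n - true_count


def chi_square_2x2(a, b, c, d):
    """Pearson chi square for the table [[a, b], [c, d]], no correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def item_chi_square(percent_lecture, percent_nonlecture, n_per_group=122):
    """Chi square comparing 'true' rates of the two groups on one item."""
    a, b = counts_from_percent(percent_lecture, n_per_group)
    c, d = counts_from_percent(percent_nonlecture, n_per_group)
    return chi_square_2x2(a, b, c, d)


# Percentages of "true" responses taken from Table 5.
for item_no, lecture_pct, nonlecture_pct in [(16, 51, 34), (21, 4, 57)]:
    chi2 = item_chi_square(lecture_pct, nonlecture_pct)
    verdict = "exceeds" if chi2 > CRITICAL_CHI2_DF1_05 else "does not exceed"
    print(f"Item {item_no}: chi square = {chi2:.2f}, {verdict} 3.841")
```

The formula divides by the row and column totals, so it is undefined for an item that every student answers the same way.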


DISCUSSION


The major question investigated in 
this study was: Do students obtain as 
much knowledge from the text as they 
do from the text and lectures on the 
textbook material? The analysis of
variance tests based on performance
on each of the four examinations
showed that there was a difference in
knowledge as measured by test per-
formance. Further, those who received
lectures on the material did signifi-
cantly better than those who did not
receive lectures. The results of the
analyses of Tests 1, 2, and 3 and of
the final examination substantiate the
superior performance of students who
received lectures (see Tables 2 and 4).
However, it is noted that the difference
on Test 2 was not significant at the
.05 level.


TABLE 5


ITEMS FROM COURSE EVALUATION FORM YIELDING CHI SQUARE p < .05


                                                           Percentage of “true”
                                                                Responses
No.   Item Text                                           Lecture     Nonlecture
                                                         (N = 122)    (N = 122)

16    The organization of the course was more                51           34
      effective than most at MSU.
17    The teacher often did not talk loud enough.             6            0
19    Difficult concepts were explained in class.            79           83
20    The teacher ought to study more thoroughly             15            3
      before he comes to class.
21    This class should meet more often.                      4           57
22    This class should meet less often.                     38           11
26    The teacher explained the material so                  13            0
      thoroughly that it was not necessary to do
      much work outside of class.
33    The teacher followed the book too closely.             23            5
39    During the term I have used my psychology              38           52
      for assignments in other classes.

The material covered prior to the 
administration of Test 2 consisted 
largely of physiological information, 
e.g., the structure and functions of 
the human ear. The other two one-hour 
tests dealt with less specific material. 
Differences between the results of Test 
2 and the other tests were caused, per- 
haps, by the nature of the material on 
the tests. The hypothesis concerning 
the effect of differences in material is 
one which could profitably be the focus 
of a future study. 

Results obtained from Test 1 and 
Test 3 are similar and, in both cases,
show the difference in test performance
between the lecture group and the 
nonlecture group to be statistically 
significant (see Tables 2 and 3). The 
difference as a function of the levels 
was a result of the design of the 
study and simply means that brighter 
students did better than the duller 
students. The statistical differences 
obtained between the two teaching 
methods are of practical importance. 
In both cases the size of the difference 
between the groups (about 3.8 ques- 
tions) was greater than three-fourths 
of the standard deviation on the re- 
spective tests (see Table 3). If students 
under both conditions had been put 
into one distribution of scores for pur- 
poses of assigning grades, a student in 
the nonlecture group would have re- 
ceived, on the average, approximately 
a full letter grade lower than a student 
of equal grade point average in the 
lecture group. 

The results presented in Table 4 
concerned with the final examination
also indicate that the difference between
the lecture and nonlecture
groups is statistically significant and
of practical significance (difference
greater than one-half the standard
deviation or about 4.5 questions). Table 4
also shows that grade point
average makes a difference.

Other statistically significant differ- 
ences in Table 4 are the difference be- 
tween instructors and the difference as 
a function of the interaction of in- 
structors X levels. The latter results 
imply that not only are there differ- 
ences in the teaching abilities of in- 
structors but that individual instruc- 
tors differ in regard to the level of 
students whom they teach most effect- 
ively. However, these differences were 
not significant on any of the three 
preceding tests. A more likely factor 
which may account for these new find- 
ings is the different procedures fol- 
lowed in the administration of the final 
examination. The final examination 
administration covered a period of ap- 
proximately one week, thus increasing 
the opportunity for communication 
regarding the examination among stu- 
dents, and increasing the variability in 
study time. Each of the one-hour 
examinations was given to all students 
on the same day. The suggestion is 
made that the significant F ratio for 
the instructor effect and the significant 
F ratio for the interaction of instructor 
X level effect are a result of the un- 
tested variance introduced by the final 
examination schedule being con- 
founded with instructor variance. This 
suggestion is supported by the results 
of the first three analyses in which the 
variance due to scheduling was un- 
doubtedly smaller and in which the 
two mentioned F ratios were not significant.
If the scheduling factor is not
responsible for the significant F ratios 
found, a further study concerned with 
differences among beginning instruc- 
tors is suggested. The results of such 
a study would be especially useful for 
improving initial teaching performance 
by assigning beginning instructors to
classes at the instructor’s best level,
or for improving the instructor’s long-
term effectiveness by apprising him
of the level at which he is most effective.

From the statistical analyses per- 
formed the inference is made that for 
introductory courses in psychology and 
similar subjects, students will manifest 
greater learning, as measured by formal 
tests, if they receive a series of lectures 
on the material in addition to reading 
the textbook. The results of this study 
also show that students with higher 
grade point averages do better than 
students with lower grade point aver- 
ages when all students receive similar 
instruction. 


Student Evaluations 


An analysis of questions presented 
in a course rating instrument was made 
to determine if there were any differ- 
ences between groups in feelings to- 
ward the course. The course rating 
inventories were filled out anony- 
mously by the students. All statements 
were answered “true” or “false.” The
results are based on virtually all the 
students who were in the classes rather 
than on those selected students who 
constituted the sample for the analyses 
made with regard to the examinations. 

The rating inventory was not spe- 
cifically developed for this study and 
contained questions which were made 
ambiguous by the design of the study. 
For example, one item stated: “The
teacher followed the book too closely.” 
Since the study was designed so that 
instructors would not follow the book 
for the nonlecture group, the significant 
difference found between the two 
groups on this item was essentially 
meaningless. 

Two other considerations are pointed 
out in connection with the interpreta- 
tion of the results from the rating 
inventory. In some cases both groups
gave the majority of responses in the
same direction. Thus, the difference in 
response frequencies is what is dis- 
cussed and not the question as such. 
For example, one item which stated, 
“Difficult concepts were explained in 
class,” yielded a significant chi square
but with a majority of both groups 
answering “true.” The interpretation 
was made that significantly more of 
the nonlecture group than of the lec- 
ture group thought that difficult con- 
cepts were explained in class. The re- 
sult was not interpreted to mean that 
concepts were explained for the non- 
lecture group and not explained for 
the lecture group. Finally, the absolute 
frequencies are of importance in giving 
meaning to the significant chi squares 
obtained. For example, one item which 
stated, “The teacher often did not 
talk loud enough,” yielded a significant
chi square. However, the frequencies 
obtained were 7 and 0. These low fre- 
quencies and the marked difference in 
exposure to the instructors render this 
finding of no practical significance. 

When the above considerations are 
applied to the results obtained from 
the rating inventory, there remain four 
items which indicate that there were 
differences between groups in feelings 
toward the course. 

Two of the items which gave signifi- 
cantly different results appear to deal 
with the teacher as a source of informa- 
tion. These items were 19 and 20. 
The direction of the differences ob- 
tained with both of these two items 
indicated that the nonlecture group 
was better satisfied with the instructor 
as a source of information than was 
the lecture group (see Table 5). This 
finding serves to refute the factor of 
variable skill of the instructors under 
different teaching conditions as an ex- 
planation of the better performance 
of the lecture group. 

Item 21, “This class should meet
more often,” and Item 22, “This class
should meet less often,” may be con-
sidered as a satisfaction measure. If
there had been no “true” answers for
either item, the assumption would fol-
low that the students were satisfied
with the classes as they were. As the
frequency of “true” answers increases,
the assumption follows that satisfac- 
tion with the number of class meetings 
is decreasing. Fifty-seven percent of 
the nonlecture group were not satisfied 
with the once a week meetings. Thirty- 
eight percent of the lecture group were 
not satisfied. A hypothesis for future 
investigation is that there is some 
optimal number of meetings per week 
between one and four which would lead 
to greater student satisfaction. 

From the results of the rating inven- 
tory, two suggestions appear. Two of 
the items considered together suggest 
that the instructors did not perform 
below their skill level when they were 
in the nonlecture situation. Two of the 
items suggest that student satisfaction 
with the course might be higher if more 
than one meeting per week but less 
than four meetings per week were held. 


SUMMARY 


This study compared two methods 
of conducting an introductory course 
in psychology, four lectures a week vs. 
one question and answer period a week. 
Each of four instructors taught two 
classes, one class by each of the two 
methods. All classes had the same 
reading assignments. 

An analysis of variance design was 
employed to determine the effects of 
different instructors, different meth- 
ods, and different levels of students, 
as measured by grade point averages, 
on the four examinations given. All 
interactions resulting from the analysis 
were examined and none was found to 
be consistently significant. The results 
showed that for either method students
with higher grade point averages do
better on the examinations than stu- 
dents with lower grade point averages. 

From the analysis it was concluded 
that students who receive instruction 
by the lecture method do better on the 
examinations than students who receive
instruction by the nonlecture
method. The inference is made that 
lectures make a significant contribu- 
tion to a college student’s education. 

A chi square analysis of the course 
rating inventory suggested that more 
than one but less than four meetings a 
week would lead to greater student 
satisfaction with the course.


In 1954 the Department of Psychology
of the University of Michigan began
supplementing its successful Honors
program for seniors with a similarly
individualized program for beginning 
students in psychology. 
This paper deals with three problems 
in connection with this program: (a)
Was the special elementary tutorial 
instruction more effective than con- 
ventional instruction? (b) What types 
of students volunteer to participate in 
a tutorial program? (c) What types of 
students benefit or lose by participa- 
tion in a tutorial program? 


DESCRIPTION OF THE TUTORIAL 
PROGRAM 


The purposes of offering the tutorial 
program in the introductory psychol- 
ogy course were: (a) to offer the highly 
motivated student a chance to get a 
deeper coverage of one or two subareas 
of psychology of special interest while 
covering the material ordinarily of- 
fered in the course; (b) to offer special 
attention to those students who can 
do better work with more individual 
attention from the instructor than they 
can in the conventional combinations 
of lecture and discussion. 

The tutorial program itself was com- 
prised of two alternative programs: 
“regular” tutorial and “laboratory” tu- 
torial. In the Regular tutorial section, 
the semester was divided into three 
segments involving an increasing
amount of student initiative. During
the first 5 or 6 weeks, the students read 
an assigned introductory textbook and 
met in groups of 15 students with an 
instructor for discussion. After this ini- 
tial period, students began the second 
major aspect of the course. For the re- 
mainder of the term each student read 
in some specific area of his choice, with 
the guidance of the tutor, wrote reports 
of his reading which were discussed 
with his tutor, and if he wished, met 
with one or more other students and the 
instructor for discussion. The third ma- 
jor aspect of the course was also car- 
ried on during the last two-thirds of 
the course but was particularly promi- 
nent in the last month of the course. 
Here, each student working individ- 
ually, or in a small group, designed, 
executed, and reported an experimental 
study. As a result of the great involve- 
ment in the research project we hoped 
to develop greater motivation for 
learning the experimental methods of 
studying human behavior. 

In the Laboratory tutorial section, 
the textbook was assigned and addi- 
tional readings were done in the ex- 
perimental literature. There was one 
2-hour seminar session each week to 
discuss certain topics and to design the 
experiment for the laboratory session 
which met each Saturday. In the lab- 
oratory session students participated 
in demonstrations of psychological research
and finally carried out research
of their own.


TABLE 1


N                                   34      34      19      19
Number of freshmen                  21      21      11
ACE Percentile Rank                 64.0    69.5    77.5    80.0
Previous Grade Point Average
  (A = 4, B = 3, etc.)

Note.—Tiv = Tutorial initial volunteers
Tlv = Tutorial late volunteers
Cnv = Conventional nonvolunteers
Clv = Conventional late volunteers

Students in conventional sections at- 
tended two lectures a week and 2 hours 
of discussion section. 


EXPERIMENTAL DESIGN 


Goals. Since the tutorial program repre- 
sented a radical (and more expensive) de- 
parture from our conventional teaching 
methods, we wished to evaluate its values in 
terms of the goals of the course and to deter- 
mine, if possible, what types of students 
gained most from this experience. The goals 
of our course have previously been described 
in detail (Beardslee, DeValois, Dulaney, Mc- 
Keachie, & Winterbottom, 1954). Briefly 
they include gains in knowledge, ability to 
apply principles of psychology, understand- 
ing of the use of scientific method in psychol- 
ogy, interest in learning more about psychol- 
ogy, positive attitudes toward psychology, 
favorable attitudes towards science, and in- 
terest in academic work. Finally, we hope 
that our students will learn to regard the be- 
havior of others in a less dogmatic, judgmen- 
tal fashion. We attempted to develop meas- 
ures of each of these objectives and will 
describe each as we present the results ob- 
tained with it. 

Sample. The tutorial students were en- 
rolled on a voluntary basis. Thirty-four stu- 
dents volunteered for tutorial sections at 
registration—we refer to them as the initial 
volunteer group (Tiv). These were matched 
with 34 students in the conventional sections 
on the basis of sex, class level, grade point 
average, and ACE score. These latter 
matched students we will call the conven-
tional nonvolunteer group (Cnv). During the
first week of classes, 38 more students applied 
for tutorial sections. We split these late vol- 
unteers into two groups with 19 students in 
each group, again matching on the basis of 
sex, class, and ACE scores. One group was as- 
signed to the tutorial section (Tlv group) 
and the other to the regular section (Clv 
group). 

The medians of the ACE score and the 
grade point average of each group are indi- 
cated in Table 1. 

Instruction. The general plan of the tu- 
torial sections was similar to that described 
in the introduction. The tutors were regular 
members of the general psychology staff se- 
lected on the basis of their interest in tu- 
torial teaching. Since each teaching fellow 
and instructor in the course is allowed free 
choice of texts, these varied, but those used 
in the tutorial section were among those used 
in control group sections. 


MEASURES AND RESULTS 


In the interest of economy of pres- 
entation our results will be discussed 
with a description of measures in terms 
of each category of the objectives 
which the elementary course is seeking 
to achieve. When possible, tests were 
given both at the beginning and end of 
the course in order that our dependent 
variable might be gains. 


Cognitive Domain 


Content in Psychology. In the area 
of psychological knowledge and skills, 
Psychology 31 (the course in this 
study) seeks to foster the acquisition of
specific content in psychology includ- 
ing knowledge of terminology, specific 
facts, conventions, principles and con- 
cepts, and theories. 


A multiple-choice achievement test with 
40 items covering various areas of psychologi- 
cal knowledge was given at the beginning of 
the semester in order to get a measure of 
what the student already knew about psy- 
chology. The initial volunteer group proved 
to be significantly inferior to their controls 
on this test while the differences between 
the late volunteer groups were not signifi- 
cant. 

The Core Final was the examination used
in all sections of general psychology as a
portion of the final examination. It consisted 
of two parts, both multiple choice in form. 
Part I was made up of items concerning 
knowledge of the content of the several areas 
of introductory psychology. Part II was a 
measure of scientific thinking in psychology 
and involved working with experimental data 
to determine valid conclusions, detecting im- 
plicit assumptions, and making judgments 
about what kinds of questions are open to ex- 
perimental investigation. 


Since control students were superior 
to tutorial students on the pretest, we 
were interested in examining gains in 
knowledge. We used as the gain score 
the raw score difference between the 
psychology test taken at the beginning 
of the semester and the final (Part I). 
The comparisons of mean gain scores 
are indicated in Table 2. 

The results show that conventional 
students gained significantly more than 
tutorial students as measured by the
Core Final. No significant differences 
were found between the Laboratory tu- 
torial and the Regular tutorial groups 
in terms of core final performance com- 
parisons. Nor were significant differ- 
ences found in scores on the scientific 
thinking items. Thus as measured by 
our final examination, tutorial students 
learned less than students in conven-
tional sections.
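
The mean gains in Table 2 can be checked with a short calculation. The sketch below recomputes the difference in mean gain and an approximate t ratio from the tabled means and standard deviations. It assumes two independent groups of equal size (34 per group for the initial volunteers and 19 per group for the late volunteers, as given under Sample) and an unpooled standard error. The article does not state which formula was used, and a matched-pairs analysis would give somewhat different values. The function name `t_from_summary` is illustrative only.

```python
import math

def t_from_summary(m1, sd1, m2, sd2, n):
    """Difference in means and an approximate t ratio for two
    independent groups of equal size n (an assumption, not the
    authors' documented procedure)."""
    diff = m2 - m1
    # Unpooled standard error of the difference between two means.
    se = math.sqrt(sd1 ** 2 / n + sd2 ** 2 / n)
    return diff, diff / se

# Mean gains and SDs as printed in Table 2: tutorial first, control second.
initial = t_from_summary(7.55, 7.00, 11.23, 6.73, n=34)
late = t_from_summary(7.67, 5.52, 12.38, 5.77, n=19)

for label, (diff, t) in (("Initial volunteers", initial),
                         ("Late volunteers", late)):
    print(f"{label}: diff = {diff:.2f}, t = {t:.2f}")
```

Under these assumptions the differences reproduce the tabled 3.68 and 4.71, while the t ratios come out slightly above the tabled 2.18 and 2.43, as would be expected if the original analysis used a somewhat different error term.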

Possible explanations for the inferior 
performance of the tutorial students 
include: 

1. Since the final examination was 
given less weight in tutorial sections, 
tutorial students may simply have been 
less motivated to study for the exami- 
nation. 

2. The tutorial students might have
been less skillful than conventional
students in taking multiple-choice
exams in psychology since they had
had less practice.

3. The conventional students used
as control group here were given spe-
cial treatment, at least as compared
with other students in the course. They
were aware of the fact that their per-
formance was being observed. This
awareness may have stimulated them
to make a greater effort. While it is
true that the tutorial students were
given special attention, their additional
motivation was not so likely to be di-
rected into the content area covered by
the Core Final.

4. The conventional classes were
centered on the text to a large extent.
Class periods often focused on discus-
sion of points in the textbook. Thus the
students in conventional classes spent
more time and effort on text materials
(which were the materials covered on
the achievement test). Thus the con-
ventional classes, focusing on the ob-
jective of teaching content, were su-
perior in achieving this objective.

TABLE 2
GAINS IN KNOWLEDGE

                         Tutorial          Control
                         M       SD        M       SD       Diff.    t
Initial Volunteers       7.55    7.00      11.23   6.73     3.68     2.18*
Late Volunteers          7.67    5.52      12.38   5.77     4.71     2.43*

*p < .05.


Application of Psychological Prin- 
ciples. The Margaret and Sherriffs test 
“How would you handle it?” was used 
as a measure of curiosity about human 
behavior and the ability to apply psy-
chological principles to life situations.
No significant differences were found 
between the conventional and tutorial 
groups. 

In answer to the question, “On the 
whole, do you feel that taking Psy- 
chology 31 helped or hindered you in 
your other courses?” differences were 
found between the Tiv and the Cnv, 
and between the Tiv and Tlv groups. 
The Tiv students reported a more fa-
vorable influence of Psychology 31
upon other courses.

TABLE 3
ATTITUDE TOWARD COURSE (MEDIANS)

Attitude                                      Tiv     Cnv     Tlv     Clv

Final course evaluation (high = 1,            1.81    2.83    1.65*   3.13*
  low = 5)

Stimulation of course (high = 1,              1.86    3.12    1.35*   3.14*
  low = 6)

Amount of time devoted to course              2.15    1.02    2.18*   1.00*
  (ratio of hours outside class to
  credit hours)

Average number of psychology courses          9.50    5.89    10.00   7.50
  student would like to take

Average number of concentration psy-          2.41    1.29    3.62    2.13
  chology courses student would like
  to take

Average number of psychology courses          2.00    0.21    1.70    1.60
  student plans to take

Number of references planned to read          3.00    1.50    2.33    1.75
  in the future

Perceived influence of course in the           .94     .25    1.20    1.08
  choice of major and vocation
  (high = 4, no influence = 0)

* Sign test comparing Tlv and Clv significant at 1% level.

Ability to Think Critically. In addi- 
tion to the scientific thinking section of 
the final examination students were 
asked to read a fabricated experiment 
and then to state what conclusions 
could be drawn and why these conclu- 
sions were justified. In addition they 
were asked for criticisms of sampling 
measures, etc. No significant differ- 
ences were found between groups and 
no changes occurred during the course 
on this measure. 

Tendency to Describe Behavior in 
Objective, Nonevaluative Terms. Our 
measure of nonevaluative tendency 
was a scale developed by Carol Slater 
in which the student is asked to choose 
one from a list of evaluative and non- 
evaluative words to complete descrip- 
tions of behavior. No significant dif- 
ferences were found on this scale. 


Attitudinal Domain 


Motivation for Continued Learning. 
The amount that can be taught in any
single course is so small in relation to
the student’s total learning that one of 
our major objectives must be to create 
motivation for continued learning. As 
indices of such an outcome we used a 
number of measures. 

Our results were clear-cut, as indi- 
cated in Table 3. Tutorial students 
thought the course to be more stimulat- 
ing and valuable than did control stu-
dents. Tutorial students also spent more
time out of class on the course, although this
was partially compensated for by less 
class time. While the tutorial students 
were not significantly different from 
the control group on our other inde- 
pendent measures of motivation, the 
difference was in the same direction on 
each of the other four measures. Fur- 
thermore less well-controlled studies in 
two other semesters also supported the 
finding that as compared with other 
students, tutorial students have more 
favorable attitudes toward the course, 
read more books, and plan to take more 
psychology courses (particularly the 
core courses). 

Willingness to Accept Responsibility
for One’s Own Progress. If we want our
students to continue learning after 
leaving the course, we must not only 
interest them in psychology but also 
must try to develop traits of independ- 
ence and initiative to continue without 
instructor direction. As an attempt to 
measure this we used a scale developed 
by Patton (1955) to measure concep- 
tions of the student’s and the teacher’s 
roles in learning. There were no signifi- 
cant differences in gains on this test. 

Our second measure in this area was 
an assignment in which students were 
given a problem too difficult to solve 
with previously acquired information. 
Not only were the answers graded, but 
students were asked to tell how they 
got the information necessary to solve 
the problem. While there was not a 
significant difference in the number of 
right answers, tutorial students con- 
sulted more resources in answering the 
question. (We are not sure whether this 
indicates greater resourcefulness or less 
independence!) 

Attitudes toward Psychology, Sci- 
ence, and Academic Work. To measure 
student attitudes toward psychology 
we used a locally devised Likert-type 
scale. Neither group made significant 
gains on this or on a scale designed to 
measure student attitudes toward sci- 
ence. 

We had used scores on three scales 
of the California Psychological In- 
ventory (Gough, 1956)—Achievement 
via Conformance, Achievement via In- 
dependence, and Psychological Mind- 
edness—primarily as individual differ- 
ences measures but we were also 
interested in changes on these scales. 
Tutorial students made significantly 
greater gains on the Achievement via 
Independence scale than did control 
students, and other differences were 
nonsignificant. 

Authoritarianism. Although the 
course produced a significant decrease 
in authoritarianism, there was no sig-
nificant difference between groups in
amount of change on the California F 
Scale. 


What Kind of Student Volunteers for 
Tutorial Work? 


In selecting students for tutorial 
work we had relied upon the act of 
volunteering for a program pictured as 
extra work as an index of motivation to 
succeed in the tutorial program. What 
sort of student volunteers for such an 
experience? 

Intellectual Characteristics. Since 
our groups were matched on ACE and 
GPA, the preceding tables do not re- 
veal whether volunteers for the tutorial 
sections tend to be the intellectually 
superior or inferior students. However, 
when we compare them with all stu- 
dents in conventional sections, it ap- 
pears that the students who volun- 
teered for tutorial work were not 
significantly different from nonvolun- 
teers in ACE scores. The tutorial stu- 
dents, however, tended to know less 
about psychology at the beginning of 
the semester than did nonvolunteers of 
equal intelligence. 

Attitudes and Personality Charac- 
teristics. When volunteers for tutorial 
sections were compared with nonvolun- 
teers matched for intelligence, sex, and 
year in college (Cnv), no significant 
differences were found in authoritari- 
anism, anxiety, or extraversion. Nei- 
ther were the volunteers significantly 
different in attitudes toward psychol- 
ogy, science, or responsibility for edu- 
cation. However, the nonvolunteers 
tended to have higher academic moti- 
vation than the initial volunteers, as 
measured by the CPI. In addition, data 
on six men in tutorial who had taken 
the TAT indicated that all six were 
below the median of nonvolunteers on 
need for Achievement and four of the 
six were above the median of nonvolun- 
teers on need for Affiliation. 

Summary of Characteristics of Vol-
unteers. We were surprised to find so
few differences between the students
volunteering for tutorial and the stu-
dents who did not volunteer. The dif-
ferences in knowledge of psychology
and academic motivation suggest that
the nonvolunteers were somewhat more
conventionally motivated toward get-
ting good grades than were the volun-
teers, although these differences are
certainly not marked. The tentative
finding that a small sample of tutorial
students who had taken the TAT were
low in need for Achievement suggests
that the motivation for volunteering
was probably not achievement. Since
the volunteers do not appear to be self- 
selected on any narrowly limited basis, 
an analysis of personality characteris- 
tics related to performance in tutorial 
should not be limited by the selectivity 
of the sample. Thus, we can now turn 
to an analysis of which kinds of stu- 
dents profit most from our tutorial pro- 
gram. 


What Kind of Student Succeeds in Tu- 
torial Work? 


Although we were reasonably well- 
satisfied with the over-all performance 
of our tutorial students, we were hope- 
ful that our battery of tests would help 
us identify characteristics which could 
be used in the selection of those stu- 
dents who would profit more from tu- 
torial sections than from conventional 
sections. 

Hypotheses. We had four hypotheses 
about likely differentiating variables. 
One was that students with greater in- 
telligence are likely to be held down in 
conventional classes by the necessity of 
pacing the class for the average stu- 
dent. Hence these students would gain 
most from tutorial instruction. Our 
second hypothesis was that conven- 
tional academic motivation as meas- 
ured by the CPI scale would lead to 
better performance in conventional
sections than in tutorial groups, since
the conventional instruction is more 
nearly like that upon which the scale 
was validated. Our third hypothesis 
was that students with strongly posi- 
tive attitudes toward psychology and 
science would do better in tutorial 
groups. Finally we suspected that the 
close, individual relationship with a tu- 
tor might hold special values for the 
student high in extraversion so that 
such students would do well in tutorial 
sections. 

Results. Our results with these anal- 
yses were disheartening. We compared 
the difference between differences in 
achievement for students high and low 
in each of our personality variables. 
Only 3 of the 30 differences tested for 
significance were significant at the 5% 
level. These three findings were: 

1. Students high in authoritarianism
as measured by the F Scale are less
likely to do poorly on the final exami-
nation in the tutorial sections than are
low F students.

2. Students high in intelligence as
measured by the ACE make relatively
greater gains on the Scientific Thinking
Test in the control sections while low
ACE students make greater gains in
the tutorial groups.

3. Students high in individual re- 
sponsibility as measured by the Patton 
scale tend to become more authoritar- 
ian in the tutorial sections while low 
responsibility students in both classes 
and high responsibility students in con- 
ventional classes became less authori- 
tarian. 

Since so many tests of significance
were made, we lack confidence in the
reliability of these results; they
may well have been due to chance.
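
The concern about chance findings can be made concrete. The sketch below computes how many of 30 tests would be expected to reach the 5% level if no true effects existed, and the probability of obtaining 3 or more such results. It assumes the 30 tests are independent. The article makes no such claim, and independence is unlikely to hold exactly for measures taken on the same students, so the figure is only a rough guide. The function name is illustrative.

```python
from math import comb

def prob_at_least(k, n, alpha):
    """P(X >= k) for X ~ Binomial(n, alpha)."""
    below = sum(comb(n, i) * alpha ** i * (1 - alpha) ** (n - i)
                for i in range(k))
    return 1.0 - below

n_tests, alpha, n_significant = 30, 0.05, 3
expected = n_tests * alpha
p_chance = prob_at_least(n_significant, n_tests, alpha)

print(f"expected by chance: {expected:.1f}")
print(f"P(at least {n_significant} significant): {p_chance:.3f}")
```

Under the independence assumption, about 1.5 significant results are expected by chance alone, and three or more would turn up in roughly 19% of such sets of tests, which is consistent with the caution expressed above.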

Analysis of student ratings of the 
course failed to reveal any significant 
interaction which affected satisfaction. 
This was rather surprising to us since 
we had expected satisfaction to be a
more sensitive indicator than the final
examination. Such surprises were com- 
monplace in this study. 


DISCUSSION 

Was the tutorial program successful? 
One’s answer to this question depends 
upon his values. Clearly if one’s chief 
goal in the first course is mastery of 
the material conventionally covered in 
a textbook in elementary psychology, 
spending the full semester on this con- 
tent pays off in a higher score on the 
final examination. On the other hand, 
if one sees the first course as a device 
for stimulating interest and increasing 
motivation for learning more psychol-
ogy, the tutorial program was success-
ful,¹ particularly if one considers the
fact that the volunteers for tutorial sec- 
tions were probably originally some- 
what less motivated for conventional 
academic goals than their controls. 
Perhaps a program could be devised
which accomplishes both objectives. At
any rate we are still trying.

¹ Interesting in this connection is a follow-
up we recently did of the students involved
in our earlier study (Guetzkow, Kelly, &
McKeachie, 1954) of three teaching methods.
In that study the highly structured recitation
method proved to be superior for student
achievement, but the follow-up showed that
no male students of the recitation sections
majored in psychology, while five and six
males, respectively, from the more permissive
tutorial and discussion sections majored in
psychology.


SUMMARY

The relative effectiveness of a spe-
cial tutorial teaching method was compared
with that of conventional lecture-dis- 
cussion. The results indicate that the 
conventional method for teaching was 
superior to the tutorial method in com- 
municating information as measured 
by a multiple-choice examination, but 
the differences were not significant on 
other measures of cognitive outcomes. 

The tutorial students, however, were
more favorable in their ratings of the
course and on other motivational meas-
ures than were conventional students.


THE RELATION BETWEEN TEACHERS’ BACKGROUNDS
AND THEIR EDUCATIONAL VALUES¹


W. CODY WILSON AND GEORGE W. GOETHALS 


Harvard University 


The investigator of teachers’ educa- 
tional values is confronted by three 
theoretical viewpoints, each explicating 
a different set of influences affecting 
the values of any given occupational 
group. These three foci of emphasis
are: preadult socialization, occupa-
tional selection, and professional
socialization.


Linton (1945), although he does not 
concern himself with questions of 
occupational values, has presented 
what is probably the most explicit 
statement of the effects of preadult 
socialization. He holds that an indi- 
vidual’s values are intimately related 
to his previous experience, particu- 
larly the experiences he has undergone 
earlier in life during the process of 
socialization within the family. From 
this he draws the corollary that indi- 
viduals who have had similar early 
experiences will have similar values; 
and, conversely, those individuals who 
have had different early experiences 
will have different values. One would 
expect, according to this position, that, 
in the absence of other influences, 
teachers who have had different early 
experiences would also be differentiated 
in terms of their values. 

The problem of occupational choice
and the concomitant problems of
occupational selection have received
considerable discussion in the last
decade (Ginzberg, Ginsburg, Axelrad,
& Herma, 1951; Roe, 1956; Super,
1953). In general, the viewpoints that
emphasize occupational selection hold
that persons who have similar values
will select similar occupations, or,
conversely, that persons who are in
similar occupations will have similar
values. In particular, Warner and his
associates (Warner, Havighurst, &
Loeb, 1944) propose that one of the
central problems in American educa-
tion generally, and in teaching par-
ticularly, is that teachers usually
either come from the middle class or
adopt, for reasons of social mobility,
values which are predominantly repre-
sentative of the middle class. From this
viewpoint one should expect that, be-
cause of the factors implicit in the
selection process, teachers with dif-
ferent background experiences would
not have accompanying differential
values.

¹ This research was supported by the
Milton Fund of Harvard University and by
the School-University Program for Re-
search and Development of the Harvard
Graduate School of Education which derives
its support from the Ford Foundation.

An earlier version of this paper was read
at the American Psychological Association
meeting at Cincinnati, Ohio, in September
1959.

The effects of professional socializa- 
tion in the area of values were pointed 
out over 30 years ago by Hughes (1928) 
and the more recent literature in this 
area has been reviewed by Nichols 
(1959). This viewpoint suggests that 
the process of socialization for the 
professional role and its attendant 
pressures, both in training and on the 
job, should result in considerable 
consensus among the members of any 
given occupational group. On the basis 
of this formulation, one should expect 
that background experiences not 
directly related to professional socializa-
tion would have little effect upon the
educational values of teachers, but that
differences in the professional socializa-
tion experience would indeed be ac-
companied by differences in educa-
tional values.

The data reported in this study 
provide partial tests of all three of 
these positions. 


METHOD


This research used the correlational 
method with two sets of variables from a 
longer questionnaire. 

Antecedent Variables. This study inves- 
tigated six dimensions as indicators of 
previous experience that might be rela- 
ted to the educational values of teachers: 
sex, socioeconomic status as measured by 
father’s occupation, religion, urban-rural 
background, type of college education, and 
amount of teaching experience. Each of 
these variables was dichotomized; that is, 
males were compared with females, upper- 
level occupations were compared with 
lower-level ones,² Catholics were compared
with Protestants, those from relatively 
rural backgrounds contrasted with those 
from urban surroundings, those whose un- 
dergraduate education was in a teacher- 
training setting were compared with those 
whose education had been in a liberal arts 
college, and those who had had 5 years or 
less teaching experience were compared 
with those who had had more than 5 years 
experience. 

The six antecedent variables are in gen- 
eral independent of each other in the sam- 
ples used in this research. The one excep- 
tion is that sex is related to type of college 
education—males are more likely to have 
been educated in a liberal arts situation. 

Consequent Variables. The consequent 
variables in this study were 61 statements 
employing the verb “should” pertaining to
educational practices. These “value” state-
ments covered 10 general areas of educa-
tional behavior: teacher-pupil relations, the
teacher’s role, educational goals, teaching 
methods, teachers’ relations with fellow 
teachers, the role of the teacher’s immedi- 
ate superior, the policy making process, the 
principles of salary determination, the 
means of recognition for superior teaching, 
and the teacher’s relations with parents. 








² The occupational classification scheme
presented by Anne Roe (1956) was used for
this purpose; Levels 1, 2, and 3 were com-
pared with Levels 4, 5, and 6.

The teachers were asked to respond to
each statement by indicating the degree to 
which they agreed or disagreed with it. The 
response scale covered six points from 
“strongly agree” to “strongly disagree.”
Some examples of the value statements that
teachers responded to are:³

The primary focus of the teaching job 
should be to guide and assist the student 
in learning activities and experiences. 

Teachers should treat students as 
equals, not as subordinates, even in the 
classroom. 

Teachers’ salaries should be based on 
the amount of education or the number 
of degrees possessed by the teacher. 

Subjects. A total of 280 teachers in three
public school systems in the area of metro-
politan Boston responded to the question-
naire. The sizes of the subsamples were: 77
from the high school in Community Y; 66 
from the secondary schools of Community 
X; and 137 from the elementary schools of 
Community X. Both Community X and 
Community Y are predominantly upper 
middle-class suburbs with a high proportion 
of their students continuing education be- 
yond the secondary level. Both communi- 
ties take great pride in the considerable 
support they give to their public school 
systems, and both systems have a local and 
national reputation for excellence. 

Analysis. The null hypothesis, that there 
was no relation between the antecedent and 
consequent variables, was tested separately 
for each of the six antecedent variables. 
The following procedure was used: the rela- 
tion between each antecedent variable, for 
example, sex, and each of the 61 value items 
was tested by the means of chi square. The 
number of chi square values that had a 
probability of occurrence of less than .10 
was observed, and this was compared, again 
by means of chi square, with the number 
expected by chance. The null hypothesis 
was rejected if the probability of this latter 
chi square was less than .05. Since it had 
previously been established that differences 
in educational values existed between the 
three school systems and especially between 
the elementary and secondary levels, this 
was held constant; that is, chi square for 
the antecedent variable and each value 
item was computed separately for each of 
the three subsamples and then combined 

for the test of relationship between the 
antecedent and the consequent value item. 
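
The counting step of this procedure can be illustrated with a small calculation. With 61 items each tested at the .10 level, 6.1 significant chi squares are expected by chance. The sketch below compares an observed count with that expectation using a goodness-of-fit chi square with one degree of freedom over the two categories "significant" and "not significant." This particular formula is an assumption, since the article does not spell out the computation, although it gives values close to those tabled for observed counts of 12, 11, and 0. The function names are illustrative.

```python
import math

N_ITEMS = 61    # value statements tested per antecedent variable
ALPHA = 0.10    # criterion applied to each item-level chi square

def count_chi_square(observed, n_items=N_ITEMS, alpha=ALPHA):
    """Goodness-of-fit chi square (1 df) comparing the observed number
    of significant items with the number expected by chance."""
    exp_sig = n_items * alpha
    exp_ns = n_items - exp_sig
    obs_ns = n_items - observed
    return ((observed - exp_sig) ** 2 / exp_sig
            + (obs_ns - exp_ns) ** 2 / exp_ns)

def p_value_1df(chi2):
    """Upper-tail probability of a chi square with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

for observed in (12, 11, 0):
    chi2 = count_chi_square(observed)
    print(f"observed = {observed:2d}  chi square = {chi2:.2f}  "
          f"p = {p_value_1df(chi2):.3f}")
```

Note that a count of zero departs from the chance expectation about as far as a count of 12 does, which is why a complete absence of differences can itself be a significant result under this procedure.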





³ The complete list of value items may be
found in Wilson and Goethals (in press).


TABLE 1
NUMBER OF VALUES RELATED TO EACH
ANTECEDENT VARIABLE

                                      Number of Values
Variable                              Observed   Expected   Chi square

Sex (male vs. female)                     5         6.1         .80
Father’s Occupation (high status          0         6.1        6.78*
  vs. low status)
Religion (Catholics vs.                  12         6.1        6.33*
  Protestants)
Type of Community Background             11         6.1        4.38*
  (rural vs. urban)
College Education (liberal arts          12         6.1        6.33*
  vs. teacher training)
Teaching Experience (experienced         11         6.1        4.38*
  vs. inexperienced)

*p < .05.


RESULTS 


A significant relationship between 
antecedent variables and educational 
values was found for four of the six 
antecedents: religion, rural-urban 
background, type of undergraduate 
college, and amount of teaching experi- 
ence. Significantly fewer relationships 
between socioeconomic status and 
educational values were found than 
would be expected by chance. Finally 
there was no overall relationship be- 
tween sex and educational values. 

Although a recent review of perti- 
nent literature by Tancock (1960) has 
suggested that boys and girls in our 
society undergo different experiences 
in socialization, there were only 5 of the 
possible 61 value items (p > .25) for 
which the sex of the respondent made 
any difference. Three of these five dif- 
ferences, however, were concentrated 
in one area—the means of recognition 
for superior teaching (see Table 2). 
Therefore, it seems probable that there 


are real differences between the sexes 
in terms of how they wish to be re- 
warded for outstanding performance 
in the teaching role. Females seem to 
feel that personal satisfaction is reward 
enough, but males want more tangible 
reward, such as increased salary or 
praise and public recognition. 

There were no differences whatever 
between high and low socioeconomic 
status as measured by father’s occupa- 
tion on any of the 61 value items. This 
is significantly (p < .01) fewer differ- 
ences than would be expected by 
chance. 

Protestants and Catholics differed 
on 12 of the 61 value items (p < .05). 
More differences were found in the 
areas of educational goals and the 
policy making process. Protestants 
tended to be more oriented toward 
intellectual interests, while the Catho- 
lics were oriented toward transmission 
of cultural values and vocational 
objectives. Catholics wanted teachers 
to take the initiative in the policy 
making process, but Protestants were 
willing to leave policy up to their 
superiors. 

Individuals who came from a rela- 
tively rural background differed from 
those with a relatively urban back- 
ground on 11 of the 61 value items 
(p < .05). These differences tended to 
concentrate in the area of salary deter- 
mination. Those with an urban back- 
ground wanted salary based upon 
education and experience, whereas 
those from a rural background desired 
it to be based upon teaching ability. 
A further pattern emerges when the 
content of the other items on which the 
two groups differed is examined. This
pattern is suggestive of Miller and 
Swanson’s (1958) distinction between 
the entrepreneurial and bureaucratic 
personality types. That is, those who 
come from a rural background tend to 
be more individual and risk taking in

TABLE 2
RELATION BETWEEN BACKGROUND VARIABLES AND VALUE ITEMS

Value Items (cell entries for this part of the table are not
legible, apart from one entry, M > F*, under Sex)

Teacher relationship with pupils (4 items)
    Friendly but reserved
    Treat as equals, not as subordinates
Role of teacher (6 items)
    Serve as character model
    Guide and assist learning experiences
    Evaluate progress and motivate dilatory student
    Impart subject matter
    Control students
Educational goals (9 items)
    Develop love of and interest in learning
    Form character
    Transmit cultural values
    Develop social skills
    Prepare for a vocation
Principles of salary determination (6 items)
    Number of duties and responsibilities
    Achievement and performance of students
    Total years of teaching experience
    Amount of education or number of degrees
Methods of policy formation (7 items)
    Planned cooperatively by all teachers
    Initiated by teachers and carried out by immediate superior
    Initiated and implemented by immediate superior
    Planned and carried out by elected committee of teachers
Methods of acknowledging outstanding teaching (5 items)
    Increase in salary
    Praise and public recognition
    Increased autonomy in teaching activities
    Personal satisfaction only

TABLE 2—Continued

                                                      Background Variables
Value Items                                  Sex       Religion     Background   Education    Experience

Relationship between teachers and parents (5 items)
    Teachers visit parents in home           ns        ns           [illegible]  ns           [illegible]
    Teachers and parents become              ns        ns           R < U**      A > E*       ns
      acquainted only when a problem
      arises
Relationship among teachers (4 items)
    Informal nonprofessional conversation    ns        ns           ns           A > E*       I > E*
    Informal discussion of professional      ns        ns           ns           ns           [illegible]
      problems
    Regularly scheduled meetings only        ns        [illegible]  ns           ns           [illegible]
    Minimum of professional and              ns        ns           ns           A < E**      I < E*
      personal interaction
Teaching methods (8 items)
    Small group discussion                   ns        ns           ns           [illegible]  ns
    Teacher-supervised study and practice    ns        [illegible]  ns           ns           ns
    Independent study                        M > F**   ns           R > U**      ns           ns
    Assign-study-recite                      ns        ns           ns           A > E**      ns
Role of teacher’s immediate su- 
perior (6 items) 
Serve as intermediary be- ns ie <€ori a <o™ ns | I> E* 
tween teachers and admin- 
istration 





Note.—Socioeconomic status is omitted since no reliable differences were found. 

® In this table the following abbreviations are used: M for male and F for female; P for Protestant and C for Catho- 
lic; U for urban and R for rural; E for education and A for liberal arts; I for inexperienced and E for experienced 

> The entry “ns” in a cell indicates that the relationship is not significant at the .10 level. 


* Significant at .10 level. 
** Significant at .05 level. 
*** Significant at .01 level. 


their orientations, whereas a person 
from a more urban setting tends to be 
oriented more toward hierarchical 
organizational settings. Indeed, one of 
Miller and Swanson’s most important 
criterion in their study was “having 
been born on a farm.” 

Teachers who were educated in a 
teacher-training setting differed from 
those educated in a liberal arts college 
on 12 of the 61 value items (p < .05). 
These differences tended to reflect a 
difference in conception of the role of 
the teacher. The liberal arts educated
are more concerned with guiding and
assisting the learning process and are 
more oriented toward assign-study- 
recite and small group discussion, 
while the education trained are ori- 
ented more toward discipline and 
control of the students. The liberal arts 
trained want informal interaction 
among teachers and more autonomy 
in the classroom, but are willing to 
leave general policy decisions to their 
superiors; on the other hand the education
trained desire a minimum of interaction
among teachers.

Experienced teachers differed from 
less experienced teachers on 11 of the 
61 value items (p < .05). These dif- 
ferences were mostly concentrated in 
the area of teacher relationships. The 
less experienced teachers want more
opportunities within the school setting
for interacting with their fellow teachers
on both a professional and a nonprofessional
level than do the more
experienced teachers. It should be
noted, however, that teaching experi- 
ence is the variable most subject to the 
influence of selection, and that the 
proper interpretation of this finding 
may be not that teachers desire less 
interaction with more experience, but 
that those teachers who want con- 
siderable interaction with their peers 
and do not find opportunities for it may 
leave the occupation. Other evidence 
(Wilson & Goethals, in press) suggests 
that the latter explanation is prefer- 
able. 

The results as a whole indicate that 
although differences among teachers 
in background experiences were asso- 
ciated with differences in educational 
values, the relationships were not large. 
This was especially true in the value 
areas of educational goals, the role of 
the teacher, and teaching methods. 


DISCUSSION 
With reference to the foci of interest 
of the theoretical positions delineated 
earlier, four of the six antecedent 
variables—sex, socioeconomic status, 
religion, and rural-urban background— 
are generally considered to be asso- 
ciated with differential preadult sociali- 
zation experiences within the family; 
the other two—type of college educa- 
tion and amount of teaching experience 
—are more directly related to profes- 
sional socialization. 
The theoretical position emphasizing 
preadult socialization would predict an
association between values and the
antecedent variables sex, socioeco- 
nomic status, religion, and rural-urban 
background. These predictions would 
be correct for the variables religion and 
rural-urban background, but would be 
wrong for the variables sex and socio- 
economic status. Thus these data 
provide only partial confirmation of 
this position. 

The theoretical position emphasizing 
occupational selection would predict 
no relation between the antecedent 
variables and educational values. This 
prediction would be correct for the two 
variables sex and socioeconomic status, 
but wrong for the other four variables. 
Two points must be noted, however: 
first, the relationships that were found 
were not very large; and secondly, 
there were significantly fewer relation- 
ships than would be expected by chance 
for the antecedent variable socioeco- 
nomic status—the variable that War- 
ner and his associates specifically 
analyzed. Thus these data provide a 
partial confirmation also for this 
position. 

The theoretical position emphasizing 
professional socialization would predict 
an association between educational 
values and the two variables type of 
college education and amount of 
teaching experience, but not for the 
other antecedent variables. This pre- 
diction would be correct for the vari- 
ables type of college education, amount 
of teaching experience, sex, and socio- 
economic status, but not for religion 
and rural-urban background. Thus the 
data presented above also provide 
partial confirmation for this theoretical 
position. 

In conclusion, preadult socialization, 
occupational selection, and professional 
socialization are all related to the 
educational values of teachers. Any 
adequate theoretical formulation 
within the area of occupational values
must take these three foci of the individual’s
experience into consideration.


SUMMARY 


The investigator of teachers’ values 
is confronted by three theoretical 
positions each focusing on different 
segments of the individual’s life history:
preadult socialization, occupational
selection, and professional
socialization. Data from 280 teachers
in three school systems concerning the
relationship between six background
variables and educational values are
presented. The two variables sex and
socioeconomic status are not related 
to educational values; the four vari- 
ables religion, rural-urban background, 
type of college education, and teaching 
experience are related to educational 
values. These results are related to the 
three theoretical positions, and each 
receives partial confirmation. The 
authors conclude that each of the foci 
is relevant to educational values and 
that all three must be considered in any 
adequate theoretical formulation 
within the area of occupational values. 


REFERENCES 
Ginzberg, E., Ginsburg, S. W., Axelrad, S., & Herma, J. L. Occupational
choice: An approach to a general theory. New York: Columbia Univer.
Press, 1951.

Hughes, E. C. Personality types and the division of labor. Amer. J.
Sociol., 1928, 33, 754-768.

Linton, R. Foreword. In A. Kardiner, The psychological frontiers of
society. New York: Columbia Univer. Press, 1945.

Miller, D. R., & Swanson, G. E. The changing American parent. New York:
Wiley, 1958.

Nichols, Irene A. A review of the literature relating to the
socialization of persons for professional roles. Unpublished special
paper for Committee on Doctoral Study, Harvard University Graduate
School of Education, 1959.

Roe, Anne. Psychology of occupations. New York: Wiley, 1956.

Super, D. E. A theory of vocational development. Amer. Psychologist,
1953, 8, 185-190.

Tancock, Catherine B. A critical survey of the research literature on
the differential treatment of boys and girls during the process of
socialization in contemporary American society. Unpublished special
paper for Committee on Doctoral Study, Harvard University Graduate
School of Education, 1960.

Warner, W. L., Havighurst, R. J., & Loeb, M. B. Who shall be educated?
New York: Harper, 1944.

Wilson, W. C., & Goethals, G. W. Sources of potential tension in the
American public educational system: A field study. In W. Allinsmith &
G. W. Goethals (Eds.), The role of the school in mental health. New
York: Basic Books, in press.


(Received April 5, 1960) 








Journal of Educational Psychology 
1960, Vol. 51, No. 5, 299-304 





THE EFFECT OF ANXIETY ON ACADEMIC ACHIEVEMENT¹

ROBERT R. GROOMS AND NORMAN S. ENDLER²

Pennsylvania State University

Anxiety and achievement are important constructs in our society. They
are especially relevant in a college community where students are
rewarded for achievement and punished for failure, and where these
reinforcements may alleviate or induce anxiety. One of the prime
functions of an educational institution is to manipulate these
reinforcements so as to maximize achievement and minimize the effects of
debilitating anxiety. Educators and psychologists alike have recently
recognized the cogent effects of anxiety on achievement in various
situations.

Until recently many investigators and theorists have treated anxiety as
a unidimensional personality trait residing within the individual, the
effects of which could be measured but not manipulated nor controlled.
This led to a dead end because one could merely state that people were
anxious or nonanxious in varying degrees. Once securely ensconced in the
“total personality” as a fixed constant, it was subject to controlled
observation and discussion, but not to experimental manipulation. This
approach appears to be limited by the effects of Freudian psychology on
the conceptualization of this variable, and the resultant unidimensional
measuring instruments.

¹ The authors wish to express their appreciation to the Division of
Counseling of [...] study. They also wish to thank Judith Segovia for
her assistance in scoring and recording test data. A modified form of
this paper was presented at the APGA National Convention in Philadelphia
on April 13, 1960.

² Now at York University, Toronto, Canada.

A multidimensional approach to the 
study of anxiety would appear to be 
more fruitful. Recent studies (Endler, 
Hunt, & Rosenstein, unpublished; 
McKeachie, Pollie, & Speisman, 1955; 
Sarason & Mandler, 1952) which treat 
anxiety as a multidimensional variable 
have led to the development of new 
techniques of measuring anxiety. Sara- 
son and Mandler (1952), for example, 
have hypothesized that anxiety is as- 
sociated with achievement situations 
through learning or conditioning. They 
further suggest that the situations 
which generate this anxiety may or 
may not contain cues which lead to 
the reduction of anxiety. For example,
a group intelligence test would not
contain the cues for anxiety reduction 
since the individual has no way of 
knowing whether or not he is perform- 
ing adequately. On the other hand, in 
a classroom situation, cues for anxiety 
reduction would typically be available 
in that the individual knows how ade- 
quately he is performing. 

The findings of McKeachie et al. 
(1955) seem particularly germane to 
this issue. McKeachie gave half of his 
students answer sheets with spaces in 
which they were invited to “feel free 
to comment” on the test questions 
while the other half of the students 
were given standard answer sheets. 
The results showed that students who 
had the opportunity to write comments 
made “reliably” higher scores than 
those who used standard answer sheets. 
The investigator’s belief was that giving
students an opportunity to comment
on test questions reduced anxiety
and its concomitant detrimental effects.

From their theorizing, Sarason and 
Mandler (1952) developed a Test 
Anxiety Questionnaire (TAQ) on at- 
titudes and experiences in three kinds 
of testing situations. These situations 
were: individual intelligence tests, 
group intelligence tests, and course 
examinations. They then proceeded to 
demonstrate that subjects (Ss) who 
scored high on these measures of 
anxiety (HA) obtained significantly 
lower aptitude examination scores and 
thus received lower predicted grade 
averages than did the Ss who scored 
low on the measures of test anxiety 
(LA). Furthermore, the HA group 
earned higher actual grade averages 
than LA Ss (approaching significance, 
p = .08). Note that the predicted 
grade average was based upon group 
intelligence tests, where anxiety re- 
ducing cues were not available, and 
that actual grade average was based 
upon performance in courses where 
such cues were available. 

The Sarason-Mandler Study was 
done at a large, privately endowed 
university. Since the findings specifi- 
cally indicate socioeconomic correlates 
or determinants of anxiety associated 
with academic achievement, it would 
seem desirable to determine whether or 
not these findings can be generalized 
to other samples. The purpose of this 
study, therefore, was to do a partial 
replication of the Sarason-Mandler 
Study, in order to test the following 
hypotheses for a different sample: 

Hypothesis I: High anxious Ss ob- 
tain significantly lower aptitude scores 
than low anxious Ss. 

Hypothesis II: High anxious Ss 
obtain significantly higher semester 
grade point averages than low anxious 
Ss. 

Hypothesis III: High anxious Ss
obtain significantly higher cumulative
grade point averages than low anxious 
Ss. 

Another purpose of this experiment
was to study the interrelationships
between anxiety, aptitude, and achievement
using simple linear product-moment
correlational procedures.³
Specific hypotheses tested were:

Hypothesis IV: A significant, nega- 
tive correlation exists between test 
anxiety scores and aptitude test scores. 

Hypothesis V: A significant, posi- 
tive correlation exists between test 
anxiety scores and actual grade point 
averages. 

Hypothesis VI: The inclusion of test 
anxiety scores in a multiple regression 
equation along with aptitude test 
scores yields a significantly more ac- 
curate prediction of academic achieve- 
ment than do aptitude scores alone. 


METHOD 


Subjects 


The Ss were 116 male students enrolled in 
introductory psychology courses at the Penn- 
sylvania State University in the fall semester 
of 1958. Twenty-five Ss were discarded be- 
cause of incomplete data, leaving a sample 
of 91. The age range was 18 to 32, and the 
range in semester standing was 1 to 8, with 
most of the Ss (82%) being sophomores or 
juniors. 


Procedure 


The Mandler-Sarason TAQ (split-half reliability .91) and a Questionnaire
on Adult Forms of Anxiety and Worry (Sarason & Gordon, 1953) (hereafter
known as the Generalized Anxiety Questionnaire or GAQ) were administered
to the Ss shortly after the semester began. Ss’ responses on the TAQ
were scored in accordance with a revised method reported by Mandler and
Cowan (1958).

Upon entering the university, students are administered a battery of
tests including the Pennsylvania State University Academic Aptitude
Examination (PSAE). The total score on this examination was used as the
aptitude score. A student’s predicted grade point average (PGPA) is
routinely derived through multiple regression equations utilizing his
high school grades and his scores on the PSAE.

³ All correlations were computed on the IBM Type 650 Data Processing
Machine, using a program written by J. D. Hall (IBM 650 Library Program,
File Number 6.0.007).

The student’s Semester Average was an 
average of all the grades earned during the 
semester in which the anxiety questionnaires 
were administered. His Cumulative Average 
was an average for all work done by the 
student through the end of the semester in 
which he completed the anxiety question- 
naires. 


RESULTS 

On the basis of their TAQ scores, 
the 91 Ss were divided into three 
groups, high anxiety (HA, N = 22), 
medium anxiety (MA, N = 47), and 
low anxiety (LA, N = 22). The HA 
and LA groups included the upper 
and lower 25% of the sample, re- 
spectively. 
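
The grouping rule just described can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: it assumes the tail size is found by truncating 25% of N to a whole number (which gives 22 of 91) and that ties at the cutoffs are broken arbitrarily, neither of which the text specifies. The function name `trichotomize` is hypothetical.

```python
def trichotomize(scores, tail_fraction=0.25):
    """Split scores into (low, middle, high) groups by rank.

    Assumption: the tail size is int(N * tail_fraction), i.e. truncated,
    so that 91 scores give tails of 22 and a middle group of 47.
    """
    ordered = sorted(scores)
    n_tail = int(len(ordered) * tail_fraction)
    low = ordered[:n_tail]
    middle = ordered[n_tail:len(ordered) - n_tail]
    high = ordered[len(ordered) - n_tail:]
    return low, middle, high


if __name__ == "__main__":
    # Placeholder scores standing in for 91 TAQ totals (not real data).
    placeholder_scores = list(range(91))
    low, middle, high = trichotomize(placeholder_scores)
    print(len(high), len(middle), len(low))
```

With 91 scores this reproduces the group sizes reported above (22, 47, and 22).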

A statistical analysis (t tests) comparing
the means of HA and LA
groups⁴ yielded no significant differences
between these groups on the
achievement and aptitude measures
(see Tables 1 and 2 for the means and
standard deviations of the various
measures used).

The linear intercorrelations between 
the various aptitude, achievement, and 
anxiety measures appear in Table 3. 
There are significant negative correla- 
tions (p < .01) between the anxiety 
tests and the PSAE, and a lack of 
significant relationship between the 
anxiety test and the achievement 
measures (Semester Average and Cu- 
mulative Average). 

⁴ Note that in this study the HA and LA groups consisted of the upper
and lower 25% of the Ss on the TAQ, whereas Sarason and Mandler (1952)
used Percentile 86 for the HA group cutoff score and Percentile 13 for
the LA group cutoff score. When they set the cutoff score at Percentile
71 for the HA group and Percentile 30 for the LA group, Sarason and
Mandler did not obtain a significant difference between HA and LA groups
on achievement.

TABLE 1
MEANS AND STANDARD DEVIATIONS FOR APTITUDE, ACHIEVEMENT, AND ANXIETY
MEASURES FOR TOTAL SAMPLE
(N = 91)

Variable | Mean | SD
Test Anxiety Questionnaire | 250.28 | 57.08
Generalized Anxiety Questionnaire | 7.94 | 4.07
Penn State Aptitude Exam | 131.92 | 22.09
Predicted Grade Point Average | 2.36 | 0.39
Semester Average | 2.42 | 0.64
Cumulative Average | 2.30 | 0.47







With TAQ and PGPA as the predictor variables and Semester Average as the
criterion variable, the multiple correlation coefficient was .30,
exactly the same as the correlation between PGPA and Semester Average
alone. Therefore, adding TAQ as a predictor variable makes no change in
predictive efficiency.
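
As a check on this result, the two-predictor multiple correlation can be recomputed from the zero-order correlations in Table 3 using the standard formula R² = (r_y1² + r_y2² − 2·r_y1·r_y2·r_12) / (1 − r_12²). The sketch below is illustrative only; the function name is hypothetical, and the inputs are the rounded table values (PGPA with Semester Average .30, TAQ with Semester Average −.05, TAQ with PGPA −.24), so small rounding differences from the authors' machine computation are to be expected.

```python
from math import sqrt


def multiple_r_two_predictors(r_y1: float, r_y2: float, r_12: float) -> float:
    """Multiple correlation of a criterion with two predictors.

    r_y1, r_y2: correlations of each predictor with the criterion.
    r_12: correlation between the two predictors.
    """
    r_squared = (r_y1**2 + r_y2**2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12**2)
    return sqrt(r_squared)


if __name__ == "__main__":
    # Rounded values read from Table 3: PGPA and TAQ predicting Semester Average.
    big_r = multiple_r_two_predictors(r_y1=0.30, r_y2=-0.05, r_12=-0.24)
    print(f"R = {big_r:.2f}")
```

With these inputs R rounds to .30, matching the reported value and the zero-order correlation for PGPA alone.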

Since there were no direct significant linear relationships between test
anxiety and academic achievement, and since the inclusion of test
anxiety in a multiple regression equation failed to increase the
predictive efficiency of academic achievement, the investigators
examined the possibility that anxiety might be functioning
differentially for various subgroups. Treating anxiety as a modifier⁵
variable led to differential relationships between PGPA and academic
achievement (Semester Average) for HA, MA, and LA groups⁶ (see Table 4).
The correlation between PGPA and Semester Average for the entire group
of Ss was .30. For the HA group there was a significant increase in
predictability from .30 to .63. For the MA and LA groups there were
nonsignificant decreases from .30 to .13 and .19, respectively.

⁵ A modifier variable is here defined as an independent variable which
when dichotomized or trichotomized leads to differential subgroup
relationships between a predictor variable and a criterion variable.
This is not to be confused with a moderator variable (Saunders, 1956),
which is a continuous independent variable that influences the
relationship between another independent variable and a dependent
variable.

⁶ The notion of dividing an independent variable into subgroups in order
to make differential predictions is not new. This procedure was used
quite successfully in a study by Frederiksen and Melville (1954) to test
the relative effects of compulsiveness and noncompulsiveness on the
relationship between interests and achievement.

TABLE 2
MEANS AND STANDARD DEVIATIONS FOR APTITUDE AND ACHIEVEMENT MEASURES FOR
HIGH ANXIOUS (HA), MEDIUM ANXIOUS (MA), AND LOW ANXIOUS (LA) GROUPS
(Sorted on Test Anxiety Questionnaire)

Variable | HA (N = 22) M | SD | MA (N = 47) M | SD | LA (N = 22) M | SD
Penn State Aptitude Exam | 125.45 | 23.32 | 131.11 | 19.52 | 140.14 | 23.45
Predicted Grade Point Av. | 2.26 | .35 | 2.33 | .35 | 2.52 | .43
Semester Average | 2.49 | .63 | 2.24 | .60 | 2.73 | .62
Cumulative Average | 2.29 | .45 | 2.21 | .41 | 2.51 | .55

TABLE 3
INTERCORRELATIONS BETWEEN APTITUDE, ANXIETY, AND ACHIEVEMENT
(N = 91)

Variable | GAQ | PSAE | PGPA | Semester Average | Cumulative Average
Test Anxiety Questionnaire (TAQ) | .46** | -.28** | -.24* | -.05 | -.12
Generalized Anxiety Questionnaire (GAQ) | | -.30** | -.18 | -.13 | -.11
Penn State Aptitude Exam (PSAE) | | | [illegible] | [illegible] | .?4**
Predicted Grade Point Average (PGPA) | | | | .30** | .30**
Semester Average | | | | | [illegible]

* p < .05.
** p < .01.

TABLE 4
CORRELATIONS BETWEEN PREDICTED GRADE POINT AVERAGE AND SEMESTER AVERAGE
FOR THE ENTIRE GROUP AND AS MODIFIED BY THE TEST ANXIETY QUESTIONNAIRE

Group | r | N
Total Sample | .30** | 91
High Anxious (HA) | .63** | 22
Medium Anxious (MA) | .13 | 47
Low Anxious (LA) | .19 | 22

** p < .01.
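
The subgroup correlations reported for Table 4 can be examined with two textbook procedures. The paper does not say which test the authors applied, so the sketch below is an assumption made for illustration: a t statistic for a single correlation, t = r·sqrt(N − 2) / sqrt(1 − r²), and a Fisher z comparison of two independent correlations. Both function names are hypothetical.

```python
from math import atanh, sqrt


def t_for_r(r: float, n: int) -> float:
    """t statistic (df = n - 2) for testing a correlation against zero."""
    return r * sqrt(n - 2) / sqrt(1 - r**2)


def fisher_z_difference(r1: float, n1: int, r2: float, n2: int) -> float:
    """Normal deviate for the difference between two independent correlations."""
    standard_error = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    return (atanh(r1) - atanh(r2)) / standard_error


if __name__ == "__main__":
    # r and N for each anxiety subgroup, as reported in Table 4.
    for label, r, n in (("HA", 0.63, 22), ("MA", 0.13, 47), ("LA", 0.19, 22)):
        print(f"{label}: t = {t_for_r(r, n):.2f} with df = {n - 2}")
    # Illustrative comparison of the HA and LA subgroup correlations.
    print(f"HA vs LA: z = {fisher_z_difference(0.63, 22, 0.19, 22):.2f}")
```

With subgroups of only 22 Ss the standard error of a difference between correlations is large, so comparisons of this kind are best read as suggestive.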


DISCUSSION


The most important finding of the 
present study is the indication that 
anxiety serves as a modifier variable 
which enhances the predictability of 
actual grades from an S’s performance 
on an aptitude examination, but only 
for the HA group. Inclusion of TAQ 
in the linear multiple regression equa- 
tion, however, failed to increase the 
predictability of academic achievement.
This suggests that it may be
profitable to use a separate regression
equation for HA Ss; the evidence
does not warrant using such a procedure
for the MA and LA groups.

It is possible that a multiple moderated⁵
regression equation, as suggested
by Saunders (1956), will improve the
predictability of academic performance.
The authors are presently engaged
in a follow-up study which
will further investigate the role of
anxiety both as a moderator and as a
modifier variable.

There are several possible reasons 
why the present findings did not repli- 
cate those of Sarason and Mandler 
(1952) regarding differences between 
HA and LA Ss. Among these reasons 
are differences in the sample charac- 
teristics. Since Yale is a privately 
endowed institution and can exercise 
greater selectivity in accepting its 
students, it is quite possible that 
there are differences in the socioeco- 
nomic level of the Ss in the two sam- 
ples and that the attrition rate due 
to failure may be lower than at the 
Pennsylvania State University, a pub- 
lic, land grant university. 

Since the present sample consisted 
primarily of sophomores and juniors 
(82%), one would expect a bias on the 
achievement variable. That is, stu- 
dents with poor averages would have 
been dropped or withdrawn from the 
university prior to entering the sopho- 
more year. There is evidence for this 
expectation in that the semester aver- 
age for this sample was 2.42, whereas 
the semester average for the entire 
university was 2.33 and for the fresh- 
man class, the semester average was 
2.07. Further evidence of a possible 
sample bias is the fact that the TAQ 
mean of this sample (250.28) is ap- 
proximately one standard deviation 
below the theoretical mean (304.00). 
If this, in fact, reflects a sample bias, 
Ss classified as HA in this sample 
might be classified MA in the total 
college population. The fact that the
observed sample biases are in the direction
of an elevated semester average
and a depressed TAQ score suggests
the possibility of a negative relationship
between TAQ and achievement.

The authors have noted the possi- 
bility of a curvilinear relationship be- 
tween test anxiety and achievement. 
The regression of achievement on test
anxiety yielded an eta correlation coefficient
of .39, which, though not quite
significant, suggests a trend.

A study now in progress, which controls
for sample biases, is further investigating
(a) the role of test anxiety
as a variable which modifies or
moderates the relationship between
grade prediction and grade achievement,
and (b) the possibility of a nonlinear
relationship between test anxiety
and academic achievement.


SUMMARY 


Aptitude, anxiety, and academic 
achievement measures were collected 
on 91 male college Ss and the scores 
on these measures were intercorre- 
lated. The Ss were then trichotomized 
into High Anxious (HA) (upper 25%), 
Medium Anxious (MA) (middle 
50%), and Low Anxious (LA) (lower 
25%) on the basis of their Test Anxiety
Questionnaire (TAQ) scores.
The experimenters then studied the 
differences and similarities among the 
HA, MA, and LA groups as well as 
their differential contribution to the 
prediction of academic achievement. 
The results suggest the following con- 
clusions: 

1. HA Ss do not differ significantly 
from LA Ss on the aptitude or achieve- 
ment measures used in this study. 

2. There is a significant negative 
correlation between test anxiety scores 
and the measure of aptitude. 

3. There is no direct, significant relationship
between test anxiety and
academic achievement.

4. Test anxiety serves as a modifier
variable which enhances the predictability
of actual grade averages from
aptitude test scores. That is, when Ss
are trichotomized into HA, MA, and 
LA groups the correlation between 
predicted grade averages and actual 
grade averages increases from a .30 for 
the total group to a .63 for the HA 
subgroup, and decreases from a .30 for 
the total group to a .13 and a .19 for 
the MA and LA subgroups, respec- 
tively. This suggests that it may be 
profitable to use a separate regression 
equation for HA Ss. At present such 
a procedure is not warranted for MA 
and LA groups. 

Studies now in progress are further 
investigating some of the implications 
of these findings. 


RESPONSES OF RETARDED CHILDREN TO THE CHILDREN’S MANIFEST ANXIETY SCALE¹

LESLIE F. MALPASS², SYLVIA MARK

Southern Illinois University

AND DAVID S. PALERMO

University of Minnesota

The Manifest Anxiety Scale (Taylor, 1953) has been revised for normal
children in the fourth, fifth, and sixth grades with CAs of
approximately 10-12 years. The children’s scale (CMAS) has been found to
be reliable (Castaneda, McCandless, & Palermo, 1956; Palermo, 1959), but
one investigation suggests it is not highly related to a “clinical”
concept of anxiety (Wirt & Broen, 1956). Although r’s between scores on
the CMAS and the Otis Mental Ability Test were found for sixth grade
girls but not for boys (McCandless & Castaneda, 1956), no studies
relating to CMAS and IQ scores of retarded children have been reported.

One purpose of the present study 
was to see whether the CMAS would 
differentiate groups of educable men- 
tally handicapped (EMH) and insti- 
tutionalized retarded children. It was 
expected that both groups of retarded 
children would have higher CMAS 
scores than normals. Another purpose 
of the study was to obtain additional 
information about the relationships 
between CMAS and IQ for both groups 
of retarded children and the normal 
children, since such data are meager. 


METHOD 


Subjects. Forty-one children enrolled in five classes for the educable
mentally handicapped (EMH group) were compared with a sample of 53
institutionalized children of similar age and IQ from the Lincoln
(Illinois) State School.³ All the EMH children were living in their own
homes, while the institutionalized retardates had resided at the Lincoln
School for at least one year. The CMAS scores of the total group of 94
retardates were compared with those of a group of 63 normal boys and
girls of similar age from the same schools as the EMH group. The
retardates’ IQs were obtained by using the Wechsler Intelligence Scale
for Children, and the normals’ IQs were obtained by using the California
Test of Mental Maturity. The Full Scale IQs were used for both groups.

¹ This study is part of a research project sponsored by the United
States Office of Education Cooperative Research Program (Contract No.
176.6471).

² Now at University of South Florida.
The ages and IQs of the groups, broken 
down by sex, are given in Table 1. 

Procedures. Since none of the retarded 
children could read well enough to complete 
the scale under standard procedures, administration
of the items was altered so that E
read the 53 statements (42 anxiety scale items 
and 11 lie scale items) to each S, recording 
each Yes-No response as it was given. A 
check on the attention of the retardates was 
made every six to eight items by asking 
them to give an example of that item. Since 
appropriate responses were given in practi- 
cally every instance, it was presumed that the 
retardates understood the statements. Nor- 
mal Ss took the test under the standard con- 
ditions of reading the items and recording 
their own Yes-No responses on the test 
blanks. 

The ¢ tests for independent samples were 
computed for both Anxiety scale (A scale) 
and Lie scale (LZ scale) items to determine 
whether differences existed between the two 





*Grateful acknowledgement is made to
the teachers in Murphysboro, Christopher,
Herrin, Carbondale, and Marion, Illinois, and
to Joseph Albaum, Superintendent, and William
Chambers, Chief Psychologist, Lincoln
State School and Colony, for their assistance
in the data collection phase of the study.
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TABLE 1 


MEAN CAs AND IQs* OF MENTALLY RETARDED AND NORMAL GROUPS

Group                   N    Mean CA   Mean IQ   SD (IQ)
All EMH Ss              41   11.74      67.80     8.94
EMH boys                31   11.69      68.67     9.26
EMH girls               10   11.87      65.62     7.92
All Institutional Ss    53   11.86      62.82     7.64
Institutional boys      32   11.78      63.76     7.82
Institutional girls     21   11.98      61.44     7.32
All Retarded Ss         94   11.80      65.29     8.64
All Normal Ss           63   11.71     110.54    15.82
Normal boys             39   11.69     111.02    14.90
Normal girls            24   11.75     109.04    16.82

* WISC Full Scale IQ scores used for retarded groups;
CTMM Full Scale IQ scores used for normal groups.


groups of retarded children and between the 
total retarded and normal groups. To evalu- 
ate relationships between the CMAS and 
IQ measures, partial correlations, with CA 
held constant, were run between CMAS Anx- 
iety scale items and IQ scores. 
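The partial correlation described above can be illustrated with a minimal sketch. It assumes the standard first-order formula, in which the correlation between two measures is adjusted for the correlation each has with a third (here, CA). The function names and the six sets of scores are hypothetical and are not data from this study.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y with z held constant."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Hypothetical scores for six children (NOT data from the study).
anxiety = [22, 30, 18, 27, 25, 20]          # A scale scores
iq = [70, 61, 72, 64, 66, 69]               # Full Scale IQs
age = [11.2, 12.5, 10.9, 12.0, 11.8, 11.4]  # chronological age (CA)

r_partial = partial_r(
    pearson_r(anxiety, iq),
    pearson_r(anxiety, age),
    pearson_r(iq, age),
)
print(round(r_partial, 3))
```

When neither measure correlates with age, the partial r reduces to the ordinary r, which is why the adjustment matters only to the extent that CA is related to the anxiety or IQ scores.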


RESULTS AND DISCUSSION 


The means and standard deviations 
for A scale and L scale scores of the 
various groups, and resulting t ratios,
are given in Table 2. Noninstitu- 
tionalized (EMH) retardates as a 
group gave significantly fewer self- 
reports representing anxiety than the 
combined institutional group. These 
differences seem to be attributable 
mainly to the EMH boys’ scores, 
which were significantly lower than 
the institutional boys’ and both groups 
of retarded girls’ scores. There were 
no significant differences between the 
institutional boys’ and girls’ A scale 
scores or between the EMH and in- 
stitutional girls’ scores, although the 
latter score differences were in the 
same direction as EMH and institu- 
tional boys’ scores. 
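The t ratios in Table 2 can be approximately checked from the tabled Ns, means, and SDs. The sketch below assumes a pooled-variance (equal-variance) t test and assumes that the tabled SDs use an n - 1 denominator; neither detail is stated in the text, and the function name `pooled_t` is illustrative only.

```python
import math

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Student's t for two independent samples, from summary statistics.

    Assumes sd1 and sd2 are sample SDs (n - 1 denominator) and that the
    two populations have equal variances.
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

# A scale values for EMH and institutional retardates, as given in Table 2.
t, df = pooled_t(41, 21.49, 7.24, 53, 25.98, 7.31)
print(f"t = {t:.3f}, df = {df}")
```

Under these assumptions the magnitude of t comes out at about 2.97 with 92 degrees of freedom, in line with the 2.965 reported for this comparison. The sign is negative only because the EMH mean is entered first.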


L. F. MALPASS, S. MARK, AND D. S. PALERMO


There was no significant difference 
between the scores of normal boys 
and girls. The mean scores for these 
groups are somewhat lower than those 
reported for the original standardiza- 
tion groups (Castaneda et al., 1956) 
and for other groups of normal south- 
ern Illinois children (Palermo, 1959). 
The total normal group obtained strik- 
ingly lower (‘less anxious’) scores 
than the total retarded group, how- 
ever. 

The Lie scale did not differentiate 
any of the groups used in the study 
except the total retarded and total 
normal groups. This suggests that there
is a greater tendency for retarded
children than for normal children to
falsify their responses. Within the retarded
groups, however, mean scores on the 
Lie scale followed the same general 
trends as on the Anxiety scale, i.e., 
EMH boys scored lower than either 
the institutional boys or EMH girls. 
The Lie scale scores for the normals 
are slightly higher (about one point) 
than those reported on the original 
standardization and later standardiza- 
tion groups (Castaneda et al., 1956; 
Palermo, 1959). It is possible that dif- 
ferential administration procedures for 
retardates and normals may have ac- 
counted for some of the observed score 
differences. However, a study by Mal- 
pass (in progress) suggests that the 
amount of difference between individual
and group administrations probably
does not exceed 2-3 points.*
Consequently, it can be assumed that 
differences between retarded and nor- 
mal children are due more to Ss’ re- 


*Mean score differences between individual
and group presentations for groups of
21 normal children (IQ range 93-110) and 24
bright children (IQ range 121-142) were
2.238 and 1.916, respectively. The t ratios were
1.546 and 2.087, respectively, the latter being
significant at the .05 level. The r’s between
individual and group presentation scores were
.68 for normals and .55 for brights.
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RETARDED CHILDREN’S RESPONSES TO AN ANXIETY SCALE 





TABLE 2
DIFFERENCES IN ANXIETY SCALE AND LIE SCALE SCORES BETWEEN
RETARDED AND NORMAL GROUPS

                                 Means               SDs              t ratios
Group                   N    A scale  L scale   A scale  L scale   A scale   L scale
EMH Retards             41   21.49    5.61      7.24     1.91      2.965**    .736
Institutional Retards   53   25.98    5.96      7.31     2.51
EMH Boys                31   20.19    5.39      6.85     1.99      3.124**    .848
Institutional Boys      32   25.75    5.81      7.26     1.99
EMH Girls               10   25.50    6.30      7.31     1.49       .090      .097
Institutional Girls     21   26.33    6.19      7.54     3.28
EMH Boys                31   20.19    5.39      6.85     1.99      2.098*    1.328
EMH Girls               10   25.50    6.30      7.31     1.49
Institutional Boys      32   25.75    5.81      7.26     1.99       .280      .521
Institutional Girls     21   26.33    6.19      7.54     3.28
Normal Boys             39   13.85    3.08      1.58     2.29       .162      .924
Normal Girls            24   13.59    3.62      5.61     2.28
Total Normals           63   13.75    3.29      6.21     2.28      8.933**   6.771**
Total Retardates        94   24.02    5.81      7.58     2.29

* Significant at the .05 level.
** Significant at the .01 level.


sponse potential than to method of 
administration. 

Table 3 presents the correlations 
between Anxiety scale and IQ scores 
for the different groups. No significant 
relationships were found between these 
measures. The partial correlation (age
held constant) between A scale and
IQ scores for normal boys (r = -.24) is
somewhat higher than that reported by
McCandless and Castaneda (1956) for
sixth grade boys’ scores on the CMAS
and Otis Quick Scoring Mental Ability
Test (-.16). However, the r for our
normal girls’ group (-.21) was lower
than that for McCandless and Castaneda’s
sixth grade girls’ group (-.45).
The difference between our results and 
those of the previous study may be 
due to the inclusion of some younger 
girls in the present sample. 
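The statement that none of these partial correlations is significant can be checked with a short sketch. It assumes the usual t test for a partial r, with one degree of freedom lost for the variable held constant (df = N - 3); the text does not say which test the authors used. The function name `partial_r_t` is hypothetical, and the critical values in the comments are standard two-tailed .05 table values.

```python
import math

def partial_r_t(r, n, n_controls=1):
    """t statistic and df for testing a partial correlation against zero."""
    df = n - 2 - n_controls
    return r * math.sqrt(df / (1 - r ** 2)), df

# Partial r's and Ns as given in Table 3 and Table 1.
t_boys, df_boys = partial_r_t(-0.235, 39)    # normal boys
t_girls, df_girls = partial_r_t(-0.316, 21)  # institutional girls

# Two-tailed .05 critical values: about 2.03 for 36 df, 2.10 for 18 df.
print(f"normal boys:         t = {t_boys:.2f}, df = {df_boys}")
print(f"institutional girls: t = {t_girls:.2f}, df = {df_girls}")
```

Both t values fall short of their critical values in absolute size, which agrees with the report that even the largest coefficient in the table (institutional girls) is not significant.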

Since the Anxiety scale scores are 
not correlated with WISC IQ scores 





TABLE 3
PARTIAL r’s (AGE HELD CONSTANT) BETWEEN
CMAS AND IQ SCORES OF RETARDED
AND NORMAL GROUPS

Group                    Correlation
All EMH Ss                   .025
EMH boys                     .066
EMH girls                    .059
All Institutional Ss        -.068
Institutional boys           .059
Institutional girls         -.316
All retarded boys           -.089
All retarded girls          -.202
All retarded Ss             -.137
All Normal Ss               -.197
Normal boys                 -.235
Normal girls                -.213





for retarded children, or with CTMM 
scores for normals, it is our conclusion 
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that the manifest anxiety and IQ 
measures were relatively independent. 
However, both the Anxiety scale and 
Lie scale differentiated retarded from 
normal children. Under the adminis- 
tration conditions outlined in this 
study, institutionalized retardates re- 
port more fears and worries (“manifest 
anxiety”) than noninstitutionalized re- 
tardates, and the combined groups of 
retardates report more manifest anx- 
iety than children of normal intelli- 
gence. 


SUMMARY 


It was predicted that a group of
institutionalized retardates (N = 53)
would receive higher (“more anxious”)
scores on the CMAS than a comparable
group of retarded children attending
public school (N = 41). Individual
administrations of the CMAS were
given to all retarded Ss. A control group
of normal children (N = 63) was
given the CMAS under standard group
conditions. The t tests for independent
samples were used to compare the
scores of the three groups. In addition,
partial r’s, holding CA constant, were
run between CMAS and IQ scores.
CMAS scores significantly differen- 
tiated educable mentally handicapped 




from institutionalized retardates,
differences being attributed mainly to the
lower (“less anxious”) scores of EMH
boys. Both groups of retarded children
had significantly higher (“more anxious”)
scores than normal children.
No significant relationships were found 
to exist between manifest anxiety, as 
measured by the CMAS, and intel- 
ligence for educable mentally handi- 
capped, institutionalized retarded, or 
normal children. 
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